Patent abstract:

Publication number: ES2540583T9
Application number: ES02758498.6T
Filing date: 2002-09-11
Publication date: 2018-02-22
Inventors: Marta Karczewicz; Antti Hallapuro
Applicant: Nokia Technologies Oy
IPC main class:
Patent description:

DESCRIPTION
Method for interpolation of subpixel values
The present invention relates to a method for interpolation of sub-pixel values in the encoding and decoding of data. It relates particularly, but not exclusively, to the encoding and decoding of digital video.
Background of the invention
Digital video sequences, like ordinary motion pictures recorded on film, comprise a sequence of still images, the illusion of motion being created by displaying the images one after the other at a relatively fast frame rate, usually 15 to 30 frames per second. Because of the relatively fast frame rate, the images in consecutive frames tend to be very similar and therefore contain a considerable amount of redundant information. For example, a typical scene may comprise some stationary elements, such as the background scenery, and some moving areas, which can take many different forms, for example, the face of a person reading a newspaper, moving traffic, and so on. Alternatively, the camera recording the scene may itself be in motion, in which case all elements of the image have the same kind of movement. In many cases, this means that the overall change between one video frame and the next is rather small. Of course, this depends on the nature of the movement. For example, the faster the movement, the greater the change from one frame to the next. Similarly, if a scene contains a number of moving elements, the change from one frame to the next is likely to be greater than in a scene in which only one element is in motion.
It should be appreciated that each frame of an original, that is uncompressed, digital video sequence comprises a very large amount of image information. Each frame of the uncompressed digital video sequence is formed from an array of image pixels. For example, in a commonly used digital video format, known as the Quarter Common Intermediate Format (QCIF), a frame comprises an array of 176 x 144 pixels, in which case each frame has 25,344 pixels. In turn, each pixel is represented by a certain number of bits, which carry information about the luminance and/or color content of the region of the image corresponding to the pixel. Commonly, a so-called YUV color model is used to represent the luminance and chrominance content of the image. The luminance, or Y, component represents the intensity (brightness) of the image, while the color content of the image is represented by two chrominance components, labelled U and V.
Color models based on a luminance/chrominance representation of image content provide certain advantages compared with color models based on a representation involving primary colors (i.e. red, green and blue: RGB). The human visual system is more sensitive to variations in intensity than to variations in color; YUV color models exploit this property by using a lower spatial resolution for the chrominance components (U, V) than for the luminance component (Y). In this way, the amount of information needed to encode the color information in an image can be reduced with an acceptable reduction in image quality.
The lower spatial resolution of the chrominance components is generally achieved by subsampling. Normally, a 16 x 16 pixel block of the image is represented by one 16 x 16 pixel block comprising luminance information, and the corresponding chrominance components are each represented by an 8 x 8 pixel block representing an area of the image equivalent to the 16 x 16 pixels of the luminance component. The chrominance components are therefore spatially subsampled by a factor of 2 in the x and y directions. The resulting set of one 16 x 16 pixel luminance block and two 8 x 8 pixel chrominance blocks is commonly referred to as a YUV macroblock, or macroblock for brevity.
A QCIF image comprises 11 x 9 macroblocks. If the luminance and chrominance blocks are represented with 8-bit resolution (i.e. by numbers in the range 0 to 255), the total number of bits required per macroblock is (16 x 16 x 8) + 2 x (8 x 8 x 8) = 3072 bits. The number of bits needed to represent a video frame in QCIF format is therefore 99 x 3072 = 304,128 bits. This means that the amount of data required to transmit/record/display a video sequence in QCIF format, represented using a YUV color model, at a rate of 30 frames per second, is more than 9 Mbps (millions of bits per second). This is an extremely high data rate and is impractical for use in video recording, transmission and display applications because of the very large storage capacity, transmission channel capacity and hardware performance required.
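The bit-count arithmetic above can be checked with a few lines of code; this is only an illustrative sketch of the calculation, with variable names chosen for clarity:

```python
# Bits needed for one YUV 4:2:0 macroblock, one QCIF frame, and one
# second of 30 frame/s video, following the figures given in the text.

BITS_PER_SAMPLE = 8            # 8-bit resolution, values 0..255
LUMA_BLOCK = 16 * 16           # one 16 x 16 luminance block
CHROMA_BLOCK = 8 * 8           # each of the two 8 x 8 chrominance blocks

bits_per_macroblock = (LUMA_BLOCK * BITS_PER_SAMPLE
                       + 2 * CHROMA_BLOCK * BITS_PER_SAMPLE)
# -> (16 x 16 x 8) + 2 x (8 x 8 x 8) = 3072 bits

macroblocks_per_qcif = 11 * 9  # a QCIF image is 11 x 9 macroblocks
bits_per_frame = macroblocks_per_qcif * bits_per_macroblock
# -> 99 x 3072 = 304,128 bits

frames_per_second = 30
bits_per_second = bits_per_frame * frames_per_second
print(bits_per_second)         # 9123840, i.e. just over 9 Mbps
```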
If the video data is to be transmitted in real time over a fixed-line network such as an ISDN (Integrated Services Digital Network) or a conventional PSTN (Public Switched Telephone Network), the available data transmission bandwidth is typically of the order of 64 kbits/s. In mobile videotelephony, where the
transmission takes place at least in part over a radio communications link, the available bandwidth can be as low as 20 kbits/s. This means that a significant reduction in the amount of information used to represent video data must be achieved in order to allow the transmission of digital video sequences over low-bandwidth communication networks. For this reason, video compression techniques have been developed that reduce the amount of information transmitted while retaining acceptable image quality.
Video compression methods are based on reducing the redundant and subjectively irrelevant parts of video sequences. Redundancy in video sequences can be categorized into spatial, temporal and spectral redundancy. "Spatial redundancy" is the term used to describe the correlation between neighboring pixels within a frame. The term "temporal redundancy" expresses the fact that objects appearing in one frame of a sequence are likely to appear in subsequent frames, while "spectral redundancy" refers to the correlation between the different color components of the same image.
Sufficiently efficient compression generally cannot be achieved by simply reducing the various forms of redundancy in a given sequence of images.
Therefore, most current video encoders also reduce the quality of those parts of the video sequence that are subjectively the least important. In addition, the redundancy of the compressed video bit stream is itself reduced by means of efficient lossless coding. Normally, this is achieved using a technique known as "variable length coding" (VLC).
Modern video compression standards, such as ITU-T recommendations H.261, H.263(+)(++) and H.26L, and the MPEG-4 standard of the Motion Picture Experts Group, make use of "motion-compensated temporal prediction". This is a form of temporal redundancy reduction in which the content of some frames (often many) in a video sequence is "predicted" from other frames in the sequence by tracking the movement of objects or regions of the image between frames.
Compressed images that do not make use of temporal redundancy reduction are generally called INTRA-coded frames or I-frames, while temporally predicted images are called INTER-coded frames or P-frames. In the case of INTER-coded frames, the (motion-compensated) predicted image is rarely accurate enough to represent the image content with sufficient quality, and therefore a spatially compressed prediction error (PE) frame is also associated with each INTER-coded frame. Many video compression schemes can also make use of bidirectionally predicted frames, which are commonly referred to as B-pictures or B-frames. B-pictures are inserted between pairs of so-called reference or "anchor" pictures (I- or P-pictures) and are predicted from either one or both of the anchor pictures. B-pictures are not themselves used as anchor pictures, that is, no other frames are predicted from them, and therefore they can be discarded from the video sequence without causing any deterioration in the quality of future pictures.
The different types of frames that occur in a typical compressed video sequence are illustrated in Figure 3 of the accompanying drawings. As can be seen from the figure, the sequence begins with an INTRA-coded frame or I-frame 30. In Figure 3, arrows 33 indicate the "forward" prediction process by which P-frames (marked 34) are formed. The bidirectional prediction process by which B-frames (36) are formed is indicated by arrows 31a and 31b, respectively.
A schematic diagram of an example video coding system using motion-compensated prediction is shown in Figures 1 and 2. Figure 1 illustrates an encoder 10 that employs motion compensation and Figure 2 illustrates a corresponding decoder 20. The encoder 10 shown in Figure 1 comprises a motion field estimation block 11, a motion field coding block 12, a motion-compensated prediction block 13, a prediction error coding block 14, a prediction error decoding block 15, a multiplexing block 16, a frame memory 17 and an adder 19. The decoder 20 comprises a motion-compensated prediction block 21, a prediction error decoding block 22, a demultiplexing block 23 and a frame memory 24.
The operating principle of video encoders using motion compensation is to minimize the amount of information in a prediction error frame En(x, y), which is the difference between the current frame In(x, y) being encoded and a prediction frame Pn(x, y). The prediction error frame is therefore:
En(x, y) = In(x, y) − Pn(x, y). (1)
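Equation (1) amounts to a pixel-wise subtraction; a minimal sketch with toy frame values (plain Python lists, names chosen for illustration):

```python
def prediction_error(current, predicted):
    """Pixel-wise En(x, y) = In(x, y) - Pn(x, y)."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(current, predicted)]

I_n = [[100, 102], [98, 97]]   # current frame (toy 2 x 2 example)
P_n = [[ 99, 102], [97, 95]]   # motion-compensated prediction
E_n = prediction_error(I_n, P_n)
print(E_n)                     # [[1, 0], [1, 2]]
```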
The prediction frame Pn(x, y) is constructed using pixel values of a reference frame Rn(x, y), which is generally one of the previously coded and transmitted frames, for example, the frame immediately preceding the current frame, and is available from the frame memory 17 of the encoder 10. More
specifically, the prediction frame Pn(x, y) is constructed by finding so-called "prediction pixels" in the reference frame Rn(x, y) that correspond substantially to pixels in the current frame. Motion information, describing the relationship (e.g. relative location, rotation, scale, etc.) between pixels in the current frame and their corresponding prediction pixels in the reference frame, is derived, and the prediction frame is constructed by moving the prediction pixels according to the motion information. In this way, the prediction frame is constructed as an approximate representation of the current frame, using pixel values in the reference frame. The prediction error frame referred to above therefore represents the difference between the approximate representation of the current frame provided by the prediction frame and the current frame itself. The basic advantage provided by video encoders using motion-compensated prediction arises from the fact that a comparatively compact description of the current frame can be obtained by representing it in terms of the motion information required to form its prediction, together with the associated prediction error information in the prediction error frame.
However, due to the very large number of pixels in a frame, it is generally not efficient to transmit separate motion information for each pixel to the decoder. Instead, in most video coding schemes, the current frame is divided into larger image segments Sk and the motion information relating to the segments is transmitted to the decoder. For example, motion information is normally provided for each macroblock of a frame and the same motion information is then used for all pixels within the macroblock. In some video coding standards, such as H.26L, a macroblock can be divided into smaller blocks, each smaller block being provided with its own motion information.
The motion information usually takes the form of motion vectors [Δx(x, y), Δy(x, y)]. The pair of numbers Δx(x, y) and Δy(x, y) represents the horizontal and vertical displacements of a pixel at location (x, y) in the current frame In(x, y) with respect to a pixel in the reference frame Rn(x, y). The motion vectors [Δx(x, y), Δy(x, y)] are calculated in the motion field estimation block 11, and the set of motion vectors of the current frame [Δx(·), Δy(·)] is called the motion vector field.
Normally, the position of a macroblock in a current video frame is specified by the (x, y) coordinate of its upper left corner. Therefore, in a video coding scheme in which motion information is associated with each macroblock of a frame, each motion vector describes the horizontal and vertical displacement, Δx(x, y) and Δy(x, y), of a pixel representing the upper left corner of a macroblock in the current frame In(x, y) with respect to a pixel in the upper left corner of a substantially corresponding block of prediction pixels in the reference frame Rn(x, y) (as shown in Figure 4b).
Motion estimation is a computationally intensive task. Given a reference frame Rn(x, y) and, for example, a square macroblock comprising N x N pixels in a current frame (as shown in Figure 4a), the objective of motion estimation is to find a block of N x N pixels in the reference frame that matches the characteristics of the macroblock in the current image according to some criterion. This criterion can be, for example, the sum of absolute differences (SAD) between the pixels of the macroblock in the current frame and the block of pixels in the reference frame with which it is compared. This process is generally known as "block matching". It should be noted that, in general, the geometry of the block to be matched and that of the block in the reference frame do not have to be the same, since real-world objects can undergo changes of scale, as well as rotation and distortion. However, in current international video coding standards, only a translational motion model is used (see below) and therefore a fixed rectangular geometry is sufficient.
Ideally, to achieve the best chance of finding a match, the whole of the reference frame should be searched. However, this is not practical, since it imposes too high a computational load on the video encoder. Instead, the search region is restricted to a region [-p, p] around the original position of the macroblock in the current frame, as shown in Figure 4c.
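The restricted full-pixel block-matching search described above can be sketched as follows. This is a toy illustration, not the method of the invention: the SAD criterion and the [-p, p] window come from the text, while the frame contents and helper names are hypothetical:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def block(frame, top, left, n):
    """Extract an n x n block whose upper left corner is (top, left)."""
    return [row[left:left + n] for row in frame[top:top + n]]

def full_pixel_search(current, reference, top, left, n, p):
    """Return the motion vector (dy, dx) within the [-p, p] window that
    minimizes the SAD against the current macroblock."""
    target = block(current, top, left, n)
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            ty, tx = top + dy, left + dx
            if 0 <= ty <= len(reference) - n and 0 <= tx <= len(reference[0]) - n:
                cost = sad(target, block(reference, ty, tx, n))
                if best is None or cost < best[0]:
                    best = (cost, (dy, dx))
    return best[1]

# Toy example: the 2 x 2 macroblock at (1, 1) in the current frame is the
# reference content shifted down and to the right by one pixel.
reference = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12],
             [13, 14, 15, 16]]
current = [[0, 0, 0, 0],
           [0, 1, 2, 0],
           [0, 5, 6, 0],
           [0, 0, 0, 0]]
print(full_pixel_search(current, reference, 1, 1, 2, 1))  # (-1, -1)
```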
In order to reduce the amount of motion information to be transmitted from the encoder 10 to the decoder 20, the motion vector field is coded in the motion field coding block 12 of the encoder 10 by representing it with a motion model. In this process, the motion vectors of the image segments are re-expressed using certain predetermined functions or, in other words, the motion vector field is represented with a model. Almost all currently used motion vector field models are additive motion models, complying with the following general formulas:
Δx(x, y) = Σi ai fi(x, y) (2)

Δy(x, y) = Σi bi gi(x, y) (3)
where the coefficients ai and bi are called motion coefficients. The motion coefficients are transmitted to the decoder 20 (information stream 2 in Figures 1 and 2). The functions fi and gi are called motion field base functions and are known to both the encoder and the decoder. An approximate motion vector field can be constructed by using the coefficients and the base functions. Since the base functions are known by (i.e. stored in) both the encoder 10 and the decoder 20, only the motion coefficients need to be transmitted to the decoder, thus reducing the amount of information required to represent the motion information of the frame.
The simplest motion model is the translational motion model, which requires only two coefficients to describe the motion vectors of each segment. The values of the motion vectors are given by:
Δx(x, y) = a0, Δy(x, y) = b0 (4)
This model is widely used in several international standards (ISO MPEG-1, MPEG-2, MPEG-4, ITU-T recommendations H.261 and H.263) to describe the motion of 16 x 16 and 8 x 8 pixel blocks. Systems using a translational motion model usually perform motion estimation at full pixel resolution or at some integer fraction of full pixel resolution, for example at half or quarter pixel resolution.
The prediction frame Pn(x, y) is constructed in the motion-compensated prediction block 13 of the encoder 10 and is given by:
Pn(x, y) = Rn[x + Δx(x, y), y + Δy(x, y)] (5)
In the prediction error coding block 14, the prediction error frame En(x, y) is normally compressed by representing it as a finite series (transform) of some two-dimensional functions. For example, a two-dimensional discrete cosine transform (DCT) can be used. The transform coefficients are quantized and entropy coded (for example, Huffman coded) before they are transmitted to the decoder (information stream 1 in Figures 1 and 2). Due to the error introduced by quantization, this operation generally produces some degradation (loss of information) in the prediction error frame En(x, y). To compensate for this degradation, the encoder 10 also comprises a prediction error decoding block 15, where a decoded prediction error frame Ẽn(x, y) is constructed by using the transform coefficients. The locally decoded prediction error frame is added to the prediction frame Pn(x, y) in the adder 19 and the resulting decoded current frame is stored in the frame memory 17 for later use as the next reference frame Rn+1(x, y).
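The transform-and-quantize step can be illustrated with a small orthonormal two-dimensional DCT followed by coarse uniform quantization. The 4 x 4 block size, the sample values and the quantization step below are arbitrary illustrative choices, not values taken from the text or from any standard:

```python
import math

N = 4  # illustrative block size

# Orthonormal DCT-II basis matrix C, so that Y = C * X * C^T.
C = [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
      * math.cos(math.pi * (2 * i + 1) * k / (2 * N))
      for i in range(N)]
     for k in range(N)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(block):
    return matmul(matmul(C, block), transpose(C))

def idct2(coeffs):
    return matmul(matmul(transpose(C), coeffs), C)

# Toy prediction error block En and a coarse uniform quantizer.
E = [[3, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 1]]
step = 2
quantized = [[round(y / step) for y in row] for row in dct2(E)]
decoded = idct2([[q * step for q in row] for row in quantized])
# 'decoded' only approximates E: quantization has introduced loss, which
# is why the encoder keeps a locally decoded copy of the error frame.
```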
The information stream 2 that carries information about the motion vectors is combined with information about the prediction error in the multiplexer 16 and an information stream 3 that normally contains at least those two types of information is sent to the decoder 20.
Next, the operation of the corresponding video decoder 20 is described.
The frame memory 24 of the decoder 20 stores a previously reconstructed reference frame Rn(x, y). The prediction frame Pn(x, y) is constructed in the motion-compensated prediction block 21 of the decoder 20 according to equation 5, using the received motion coefficient information and pixel values of the previously reconstructed reference frame Rn(x, y). The transmitted transform coefficients of the prediction error frame En(x, y) are used in the prediction error decoding block 22 to construct the decoded prediction error frame Ẽn(x, y). The pixels of the decoded current frame are then reconstructed by adding the prediction frame Pn(x, y) and the decoded prediction error frame Ẽn(x, y):
Ĩn(x, y) = Pn(x, y) + Ẽn(x, y) = Rn[x + Δx(x, y), y + Δy(x, y)] + Ẽn(x, y). (6)
This decoded current frame can be stored in the frame memory 24 as the next reference frame Rn+1(x, y).
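The decoder-side reconstruction in equation (6) is again a pixel-wise operation; a minimal sketch with toy values, with results restricted to the 8-bit range as video decoders normally do (function names are illustrative):

```python
def clip(v, lo=0, hi=255):
    """Restrict a value to the allowed 8-bit pixel range."""
    return max(lo, min(hi, v))

def reconstruct(prediction, error):
    """Pixel-wise In(x, y) = Pn(x, y) + En(x, y), clipped to [0, 255]."""
    return [[clip(p + e) for p, e in zip(prow, erow)]
            for prow, erow in zip(prediction, error)]

P_n = [[99, 102], [97, 254]]
E_n = [[1, 0], [1, 5]]          # decoded prediction error
print(reconstruct(P_n, E_n))    # [[100, 102], [98, 255]]  (259 clipped to 255)
```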
In the description of motion-compensated encoding and decoding of digital video presented above, the motion vector [Δx(x, y), Δy(x, y)] describing the motion of a macroblock in the current frame with respect to the reference frame Rn(x, y) can point to any of the pixels in the reference frame. This means that motion between frames of a digital video sequence can only be represented at a resolution determined by the image pixels in the frame (so-called full pixel resolution). Real motion, however, has arbitrary precision, and therefore the system described above can only provide approximate modeling of the motion between successive frames of a digital video sequence. Normally, modeling of motion between video frames with full pixel resolution is not accurate enough to allow efficient minimization of the prediction error (PE) information associated with each macroblock/frame. Therefore, to allow more accurate modeling of real motion and to help reduce the amount of PE information that must be transmitted from the encoder to the decoder, many video coding standards, such as H.263(+)(++) and H.26L, allow motion vectors to point "between" image pixels. In other words, motion vectors can have "sub-pixel" resolution. Allowing motion vectors to have sub-pixel resolution adds to the complexity of the encoding and decoding operations that must be performed, so it is still advantageous to limit the degree of spatial resolution a motion vector may have. Thus, video coding standards such as those mentioned above typically only allow motion vectors with full pixel, half pixel or quarter pixel resolution.
Motion estimation with sub-pixel resolution is usually performed as a two-stage process, as illustrated in Figure 5 for a video coding scheme that allows motion vectors to have full or half pixel resolution. In the first stage, a motion vector having full pixel resolution is determined using any appropriate motion estimation scheme, such as the block matching process described above. The resulting motion vector, having full pixel resolution, is shown in Figure 5.
In the second stage, the motion vector determined in the first stage is refined to obtain the desired half-pixel resolution. In the example illustrated in Figure 5, this is done by forming eight new 16 x 16 pixel search blocks, the position of the upper left corner of each block being marked with an X in Figure 5. These positions are denoted [Δx + m/2, Δy + n/2], in which m and n can take the values -1, 0 and +1, but cannot both be zero at the same time. Since only the pixel values of the original image pixels are known, the values (for example, luminance and/or chrominance values) of the sub-pixels residing at half-pixel positions must be estimated for each of the eight new search blocks, using some form of interpolation scheme.
Once the sub-pixel values have been interpolated at half-pixel resolution, each of the eight search blocks is compared with the macroblock whose motion vector is being sought. As in the block matching procedure performed to determine the motion vector with full pixel resolution, the macroblock is compared with each of the eight search blocks according to some criterion, for example a SAD. As a result of the comparisons, a minimum SAD value will generally be obtained. Depending on the nature of the motion in the video sequence, this minimum may correspond to the position specified by the original motion vector (having full pixel resolution) or it may correspond to a position having half-pixel resolution. Thus, it is possible to determine whether a motion vector should point to a full pixel or a sub-pixel position and, if sub-pixel resolution is appropriate, to determine the correct sub-pixel resolution motion vector. It should also be appreciated that the scheme just described can be extended to other sub-pixel resolutions (for example, quarter pixel resolution) in an entirely analogous manner.
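The enumeration and selection of the half-pixel candidates in the second stage can be sketched as follows; the cost function standing in for the SAD comparison of the search blocks is purely hypothetical:

```python
def half_pixel_candidates(dx, dy):
    """The eight positions (dx + m/2, dy + n/2) with m, n in {-1, 0, 1},
    excluding m = n = 0, as in Figure 5."""
    return [(dx + m / 2, dy + n / 2)
            for m in (-1, 0, 1) for n in (-1, 0, 1)
            if (m, n) != (0, 0)]

def refine(dx, dy, cost):
    """Keep the full-pixel vector or move to the half-pixel candidate
    with the smallest cost (e.g. SAD against the macroblock)."""
    return min([(dx, dy)] + half_pixel_candidates(dx, dy), key=cost)

# Toy cost: prefer the candidate closest to a "true" motion of (2.5, -1.0).
best = refine(3, -1, cost=lambda v: abs(v[0] - 2.5) + abs(v[1] + 1.0))
print(best)   # (2.5, -1.0)
```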
In practice, the estimation of a sub-pixel value in the reference frame is made by interpolating the value of the sub-pixel from the surrounding pixel values. In general, the interpolation of a sub-pixel value F (x, y) located in a non-integer position (x, y) = (n + Ax, m + Ay), can be formulated as a two-dimensional operation, represented mathematically as:
F(x, y) = Σ (k = −K to K) Σ (l = −L to L) f(k, l) F(n + k, m + l) (7)
where f (k, l) are filter coefficients and n and m are obtained by truncating x and y, respectively, to integer values. Normally, filter coefficients depend on the values of x and y, and interpolation filters generally
are called "separable filters", in which case the sub-pixel value F(x, y) can be calculated as follows:
F(x, y) = Σ (k = −K to K) f(k) Σ (l = −L to L) f(l) F(n + k, m + l) (8)
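Equations (7) and (8) can be checked against each other numerically. The sketch below uses a toy 3-tap filter and assumes the 2-D coefficients are separable as f(k, l) = f(k)·f(l); the coefficient values are illustrative and are not taken from any of the filters described here:

```python
def interp_2d(F, n, m, f2d, K, L):
    """Equation (7): sum over k and l of f(k, l) * F(n + k, m + l)."""
    return sum(f2d[k + K][l + L] * F[n + k][m + l]
               for k in range(-K, K + 1) for l in range(-L, L + 1))

def interp_separable(F, n, m, fh, fv, K, L):
    """Equation (8): apply the column filter, then the row filter."""
    return sum(fh[k + K] * sum(fv[l + L] * F[n + k][m + l]
                               for l in range(-L, L + 1))
               for k in range(-K, K + 1))

F = [[10, 20, 30],
     [40, 50, 60],
     [70, 80, 90]]
fh = fv = [0.25, 0.5, 0.25]                  # toy separable filter taps
f2d = [[a * b for b in fv] for a in fh]      # equivalent 2-D coefficients

a = interp_2d(F, 1, 1, f2d, 1, 1)
b = interp_separable(F, 1, 1, fh, fv, 1, 1)
print(a == b, a)   # True 50.0
```

With separable filters the 2-D interpolation decomposes into a row pass and a column pass, which is what makes schemes like those in the test models computationally attractive.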
The motion vectors are calculated in the encoder. Once the corresponding motion coefficients have been transmitted to the decoder, it is a straightforward matter to interpolate the required sub-pixels using an interpolation method identical to that used in the encoder. In this way, a frame following a reference frame in the frame memory 24 can be reconstructed from the reference frame and the motion vectors.
The most straightforward way to apply sub-pixel value interpolation in a video encoder is to interpolate each sub-pixel value whenever it is needed. However, this is not an efficient solution in a video encoder, since the same sub-pixel value is likely to be required several times and the calculations to interpolate that value would therefore be performed multiple times. This results in an unnecessary increase in computational complexity/load in the encoder.
An alternative approach, which limits the complexity of the encoder, is to calculate and store all sub-pixel values in advance in a memory associated with the encoder. This solution is referred to as "beforehand" interpolation in the remainder of this document. Although the complexity is limited, beforehand interpolation has the disadvantage of increasing memory usage by a large margin. For example, if the accuracy of the motion vectors is a quarter pixel in both the horizontal and vertical dimensions, storing precalculated sub-pixel values for a complete image results in a memory usage 16 times that required to store the original, non-interpolated image. It also involves the calculation of some sub-pixels that may never actually be required in the calculation of motion vectors in the encoder. Beforehand interpolation is also particularly inefficient in a video decoder, since most of the precalculated sub-pixel values will never be required by the decoder. Therefore, it is advantageous not to use precalculation in the decoder.
So-called "on-demand" interpolation can be used to reduce the memory requirements in the encoder. For example, if the desired accuracy is quarter pixel resolution, only the sub-pixels at half-pixel resolution are interpolated beforehand for the whole frame and stored in memory. The quarter-pixel resolution sub-pixel values are calculated during the motion estimation/compensation process only as and when they are required. In this case, the memory usage is only 4 times that required to store the original, non-interpolated image.
It should be noted that when beforehand interpolation is used, the interpolation process constitutes only a small fraction of the total computational complexity/load of the encoder, since each sub-pixel is interpolated only once. Therefore, in the encoder, the complexity of the interpolation process itself is not very critical when beforehand interpolation of sub-pixel values is used. On the other hand, on-demand interpolation places a much higher computational load on the encoder, since sub-pixels may be interpolated many times. Therefore, the complexity of the interpolation process, which can be considered in terms of the number of computational operations or processor cycles that must be performed to interpolate sub-pixel values, becomes an important consideration.
In the decoder, the same sub-pixel values are used at most a few times, and some are not needed at all. Therefore, it is advantageous in the decoder not to use beforehand interpolation at all, that is, it is advantageous not to calculate any sub-pixel values in advance.
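One way to avoid recomputing a value that on-demand interpolation has already produced is to memoize each sub-pixel the first time it is requested. This sketch is not part of the schemes described in the text, merely an illustration of the trade-off; the interpolation function supplied to it is a trivial stand-in, not one of the filters discussed here:

```python
class SubPixelCache:
    """Interpolate sub-pixel values on demand and memoize the results,
    so repeated requests do not repeat the computation."""

    def __init__(self, interpolate):
        self._interpolate = interpolate
        self._cache = {}
        self.computed = 0            # counts actual interpolations performed

    def value(self, x, y):
        if (x, y) not in self._cache:
            self._cache[(x, y)] = self._interpolate(x, y)
            self.computed += 1
        return self._cache[(x, y)]

# Stand-in interpolator for illustration only.
cache = SubPixelCache(lambda x, y: x + y)
cache.value(1.5, 2.0)
cache.value(1.5, 2.0)                # second request is served from the cache
print(cache.computed)                # 1
```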
Two interpolation schemes have been developed as part of the ongoing work of the ITU-T Telecommunication Standardization Sector, Study Group 16, Video Coding Experts Group (VCEG), Questions 6 and 15. These approaches have been proposed for incorporation in ITU-T recommendation H.26L and have been implemented in test models (TML) for evaluation and further development purposes. The test model corresponding to Question 15 is called Test Model 5 (TML5), while that resulting from Question 6 is known as Test Model 6 (TML6). The interpolation schemes proposed in TML5 and TML6 will now be described.
Throughout the description of the interpolation scheme of subpixel values used in the TML5 test model, reference will be made to Figure 12a, which defines a notation for describing specific pixel and subpixel positions for TML5. A separate notation, defined in Figure 13a, will be used in the discussion of the interpolation scheme of subpixel values used in TML6. An additional notation, illustrated in Figure 14a, will be used later in the text in connection with the method of interpolating sub-pixel values according to the invention. It will be appreciated that the three different notations used in the text are intended to help understand each interpolation method and help distinguish differences between them. However, in all three figures, the letter A is used to indicate original image pixels (full pixel resolution). More specifically, the
letter A represents pixel positions in the image data representing a frame of a video sequence; the pixel values of the pixels A are either received as the current frame In(x, y) from a video source, or reconstructed and stored as a reference frame Rn(x, y) in the frame memory 17, 24 of the encoder 10 or decoder 20. All other letters represent sub-pixel positions, the values of the sub-pixels located at the sub-pixel positions being obtained by interpolation.
Some other expressions will also be used consistently throughout the text to identify particular positions of pixels and subpixels. These are the following:
The expression "horizontal unit position" is used to describe the position of any sub-pixel that is constructed in a column of the original image data. Subpixels c and e in figures 12a and 13a, as well as subpixels b and e in figure 14a have horizontal unit positions.
The expression "vertical unit position" is used to describe any sub-pixel that is constructed in a row of the original image data. Subpixels b and d in figures 12a and 13a as well as subpixels b and d in figure 14a have vertical unit positions.
By definition, pixels A have horizontal unit and vertical unit positions.
The expression "half horizontal position" is used to describe the position of any sub-pixel constructed in a column that is at half-pixel resolution. The sub-pixels b, c and e shown in Figures 12a and 13a fall into this category, as do the sub-pixels b, c and f of Figure 14a. Similarly, the expression "half vertical position" is used to describe the position of any sub-pixel constructed in a row that is at half-pixel resolution, such as the sub-pixels c and d in Figures 12a and 13a, as well as the sub-pixels b, c and g in Figure 14a.
In addition, the expression "quarter horizontal position" refers to any sub-pixel constructed in a column that is at quarter-pixel resolution, such as the sub-pixels e and d in Figure 12a, the sub-pixels d and g in Figure 13a and the sub-pixels d, g and h in Figure 14a. Similarly, the expression "quarter vertical position" refers to sub-pixels constructed in a row that is at quarter-pixel resolution. In Figure 12a, the sub-pixels e and f fall into this category, as do the sub-pixels e, f and g in Figure 13a and the sub-pixels e, f and h in Figure 14a.
The definition of each of the expressions described above is shown by "envelopes" drawn in the corresponding figures.
It should also be noted that it is often convenient to refer to a particular sub-pixel with a two-dimensional reference. In this case, the appropriate two-dimensional reference can be obtained by examining the intersection of the envelopes in Figures 12a, 13a and 14a. Applying this principle, the sub-pixel d in Figure 12a, for example, has a quarter horizontal and half vertical position, and the sub-pixel e has a horizontal unit and quarter vertical position. In addition, and for ease of reference, sub-pixels residing at half horizontal and vertical unit positions, at horizontal unit and half vertical positions, as well as at half horizontal and half vertical positions, will be called 1/2 resolution sub-pixels. Sub-pixels residing at any quarter horizontal and/or quarter vertical position will be called 1/4 resolution sub-pixels.
It should also be noted that in the descriptions of the two test models and in the detailed description of the invention itself, pixels will be assumed to have a minimum value of zero and a maximum value of 2^n - 1, where n is the number of bits reserved for a pixel value. The number of bits is normally 8. After a sub-pixel has been interpolated, if the value of the interpolated sub-pixel falls outside the range [0, 2^n - 1] it is restricted to that range, that is, values less than the minimum allowed value become the minimum value (0) and values greater than the maximum become the maximum value (2^n - 1). This operation is called trimming.
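The trimming operation described above can be sketched as a simple clipping helper; this is an illustrative sketch, and the function name `clip` is ours, not from the test models:

```python
def clip(value, n=8):
    """Restrict an interpolated value to the allowed range [0, 2^n - 1]."""
    max_value = (1 << n) - 1  # 2^n - 1, e.g. 255 when n = 8
    return min(max(value, 0), max_value)
```

For 8-bit pixels, for example, a negative interpolation result is trimmed to 0 and an overshoot above 255 is trimmed to 255.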
Next, the interpolation scheme of sub-pixel values according to TML5 is described in detail, with reference to Figures 12a, 12b and 12c.
1. The value for the sub-pixel at a half-unit horizontal and unit vertical position, that is, 1/2 resolution sub-pixel b in Figure 12a, is calculated using a six-tap filter. The filter interpolates a value for 1/2 resolution sub-pixel b based on the values of the 6 pixels (A1 to A6) located in a row at unit horizontal and unit vertical positions symmetrically around b, as shown in Figure 12b, according to the formula b = (A1 - 5A2 + 20A3 + 20A4 - 5A5 + A6 + 16) / 32. The operator / indicates division with truncation. The result is trimmed so that it lies in the range [0, 2^n - 1].
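As an illustration, step 1 can be sketched as follows (a hedged sketch, not normative code; the function and argument names are ours). For non-negative filter outputs Python's floor division matches division with truncation, and negative outputs are trimmed to 0 in either case, so the final value is the same:

```python
def half_pel_b(A, n=8):
    """TML5 step 1: 1/2 resolution sub-pixel b from six row pixels A1..A6
    using the six-tap filter (1, -5, 20, 20, -5, 1) with rounding offset 16,
    division by 32 and trimming to [0, 2^n - 1]."""
    A1, A2, A3, A4, A5, A6 = A
    b = (A1 - 5 * A2 + 20 * A3 + 20 * A4 - 5 * A5 + A6 + 16) // 32
    return min(max(b, 0), (1 << n) - 1)
```

On a uniform area the filter reproduces the pixel value, since its coefficients sum to 32.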
2. The values for the 1/2 resolution sub-pixels marked with a c are calculated using the same six-tap filter used in step 1 and the closest six pixels or sub-pixels (A or b) in the vertical direction. Referring now to Figure 12c, the filter interpolates a value for the 1/2 resolution sub-pixel c located at a unit horizontal and half-unit vertical position based on the values of the 6 pixels (A1 to A6) located in a column at unit horizontal and unit vertical positions symmetrically around c, according to the formula c = (A1 - 5A2 + 20A3 + 20A4 - 5A5 + A6 + 16) / 32. Similarly, a value for the 1/2 resolution sub-pixel c at a half-unit horizontal and half-unit vertical position is calculated according to c = (b1 - 5b2 + 20b3 + 20b4 - 5b5 + b6 + 16) / 32. Again, the operator / indicates division with truncation. The values calculated for the sub-pixels c are also trimmed so that they lie in the range [0, 2^n - 1].
At this point in the interpolation process the values of all sub-pixels of resolution 1/2 have been calculated and the process proceeds to the calculation of sub-pixel values of resolution 1/4.
3. The values for the 1/4 resolution sub-pixels marked with the letter d are calculated using linear interpolation and the values of the closest pixels and/or 1/2 resolution sub-pixels in the horizontal direction. More specifically, the 1/4 resolution sub-pixels d located at quarter-unit horizontal and unit vertical positions are calculated by taking the average of the immediately neighboring pixel at a unit horizontal and unit vertical position (pixel A) and the immediately neighboring 1/2 resolution sub-pixel at a half-unit horizontal and unit vertical position (sub-pixel b), that is, according to d = (A + b) / 2. The values for the 1/4 resolution sub-pixels d located at quarter-unit horizontal and half-unit vertical positions are calculated by taking the average of the immediately neighboring 1/2 resolution sub-pixels c that are at a unit horizontal and half-unit vertical position and at a half-unit horizontal and half-unit vertical position respectively, that is, according to d = (c1 + c2) / 2. Again, the operator / indicates division with truncation.
4. The values for the 1/4 resolution sub-pixels marked with the letter e are calculated using linear interpolation and the values of the closest pixels and/or 1/2 resolution sub-pixels in the vertical direction. In particular, the 1/4 resolution sub-pixels e at unit horizontal and quarter-unit vertical positions are calculated by taking the average of the immediately neighboring pixel at a unit horizontal and unit vertical position (pixel A) and the immediately neighboring sub-pixel at a unit horizontal and half-unit vertical position (sub-pixel c), according to e = (A + c) / 2. The 1/4 resolution sub-pixels e at half-unit horizontal and quarter-unit vertical positions are calculated by taking the average of the immediately neighboring sub-pixel at a half-unit horizontal and unit vertical position (sub-pixel b) and the immediately neighboring sub-pixel at a half-unit horizontal and half-unit vertical position (sub-pixel c), according to e = (b + c) / 2. In addition, the 1/4 resolution sub-pixels e at quarter-unit horizontal and quarter-unit vertical positions are calculated by taking the average of the immediately neighboring sub-pixel at a quarter-unit horizontal and unit vertical position and the corresponding sub-pixel at a quarter-unit horizontal and half-unit vertical position (sub-pixels d), according to e = (d1 + d2) / 2. Again, the operator / indicates division with truncation.
5. The value for the 1/4 resolution sub-pixel f is interpolated by averaging the values of the 4 closest pixels at unit horizontal and unit vertical positions, according to f = (A1 + A2 + A3 + A4 + 2) / 4, where pixels A1, A2, A3 and A4 are the four closest original pixels.
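Steps 3 to 5 reduce to truncating averages of the nearest values; a minimal sketch (the helper names are our own):

```python
def quarter_pel_pair(p, q):
    """TML5 steps 3-4: 1/4 resolution value as the truncating average of
    the two nearest pixel/sub-pixel values, e.g. d = (A + b) / 2."""
    return (p + q) // 2

def quarter_pel_f(A1, A2, A3, A4):
    """TML5 step 5: sub-pixel f from the four closest original pixels,
    f = (A1 + A2 + A3 + A4 + 2) / 4 with division by truncation."""
    return (A1 + A2 + A3 + A4 + 2) // 4
```

Note that the pairwise averages of steps 3 and 4 carry no rounding offset, while step 5 adds an offset of 2 before dividing by 4.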
A drawback of TML5 is that the decoder is computationally complex. This results from the fact that TML5 uses an approach in which the interpolation of 1/4 resolution sub-pixel values depends on the interpolation of 1/2 resolution sub-pixel values. This means that to interpolate the values of the 1/4 resolution sub-pixels, the values of the 1/2 resolution sub-pixels from which they are determined must be calculated first. In addition, since the values of some of the 1/4 resolution sub-pixels depend on the interpolated values obtained for other 1/4 resolution sub-pixels, truncation of the 1/4 resolution sub-pixel values has a negative effect on the accuracy of some of the 1/4 resolution sub-pixel values. Specifically, those 1/4 resolution sub-pixel values are less accurate than they would be if they were calculated from values that had not been truncated and trimmed. Another drawback of TML5 is that it is necessary to store the 1/2 resolution sub-pixel values in order to interpolate the 1/4 resolution sub-pixel values. Therefore, extra memory is required to store a result that is not ultimately required.
Next, the interpolation scheme for sub-pixel values according to TML6, referred to herein as direct interpolation, is described. The interpolation method in the encoder according to TML6 functions like the TML5 interpolation method described above, except that maximum accuracy is retained throughout. This is achieved by using intermediate values that are neither rounded nor trimmed. A step-by-step description of the interpolation method according to TML6 as applied in the encoder is given below with reference to Figures 13a, 13b and 13c.
1. The value for the sub-pixel at a half-unit horizontal and unit vertical position, that is, 1/2 resolution sub-pixel b in Figure 13a, is obtained by first calculating an intermediate value b using a six-tap filter. The filter calculates b based on the values of the 6 pixels (A1 to A6) located in a row at unit horizontal and unit vertical positions symmetrically around b, as shown in Figure 13b, according to the formula b = (A1 - 5A2 + 20A3 + 20A4 - 5A5 + A6). The final value of b is then calculated as b = (b + 16) / 32 and is trimmed to lie in the range [0, 2^n - 1]. As before, the operator / indicates division with truncation.
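The distinguishing feature of TML6, keeping an unrounded intermediate value alongside the final trimmed value, can be sketched as follows (illustrative helper names, not normative code):

```python
def tml6_intermediate_b(A):
    """TML6 step 1: unrounded, untrimmed intermediate value b from six row
    pixels A1..A6; full precision is retained for the later steps."""
    A1, A2, A3, A4, A5, A6 = A
    return A1 - 5 * A2 + 20 * A3 + 20 * A4 - 5 * A5 + A6

def tml6_final_b(b, n=8):
    """Final 1/2 resolution value: b = (b + 16) / 32, trimmed to the
    range [0, 2^n - 1]."""
    return min(max((b + 16) // 32, 0), (1 << n) - 1)
```

The intermediate value carries a scale factor of 32 relative to the pixel values; this factor is compensated in the later direct formulas.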
2. The values for the 1/2 resolution sub-pixels marked with the letter c are obtained by first calculating intermediate values c. With reference to Figure 13c, an intermediate value c for the 1/2 resolution sub-pixel c located at a unit horizontal and half-unit vertical position is calculated based on the values of the 6 pixels (A1 to A6) located in a column at unit horizontal and unit vertical positions symmetrically around c, according to the formula c = (A1 - 5A2 + 20A3 + 20A4 - 5A5 + A6). The final value for the 1/2 resolution sub-pixel c located at a unit horizontal and half-unit vertical position is calculated according to c = (c + 16) / 32. Similarly, an intermediate value c for the 1/2 resolution sub-pixel c at a half-unit horizontal and half-unit vertical position is calculated according to c = (b1 - 5b2 + 20b3 + 20b4 - 5b5 + b6). A final value for this 1/2 resolution sub-pixel c is then calculated according to c = (c + 512) / 1024. Again, the operator / indicates division with truncation, and the values calculated for the 1/2 resolution sub-pixels c are then trimmed to lie in the range [0, 2^n - 1].
3. Values for the 1/4 resolution sub-pixels marked with the letter d are calculated as follows. The values for the 1/4 resolution sub-pixels d located at quarter-unit horizontal and unit vertical positions are calculated from the value of the immediately neighboring pixel at a unit horizontal and unit vertical position (pixel A) and the intermediate value b calculated in step (1) for the immediately neighboring 1/2 resolution sub-pixel at a half-unit horizontal and unit vertical position (1/2 resolution sub-pixel b), according to d = (32A + b + 32) / 64. The values for the 1/4 resolution sub-pixels d located at quarter-unit horizontal and half-unit vertical positions are interpolated using the intermediate values c calculated for the immediately neighboring 1/2 resolution sub-pixels c that are at the unit horizontal and half-unit vertical position and the half-unit horizontal and half-unit vertical position respectively, according to d = (32c1 + c2 + 1024) / 2048. Again, the operator / indicates division with truncation, and the finally obtained 1/4 resolution sub-pixel values d are trimmed to lie in the range [0, 2^n - 1].
4. Values for the 1/4 resolution sub-pixels marked with the letter e are calculated as follows. The values for the 1/4 resolution sub-pixels e located at unit horizontal and quarter-unit vertical positions are calculated from the value of the immediately neighboring pixel at a unit horizontal and unit vertical position (pixel A) and the intermediate value c calculated in step (2) for the immediately neighboring 1/2 resolution sub-pixel at a unit horizontal and half-unit vertical position, according to e = (32A + c + 32) / 64. The values for the 1/4 resolution sub-pixels e located at half-unit horizontal and quarter-unit vertical positions are calculated from the intermediate value b calculated in step (1) for the immediately neighboring 1/2 resolution sub-pixel at a half-unit horizontal and unit vertical position and the intermediate value c calculated in step (2) for the immediately neighboring 1/2 resolution sub-pixel at a half-unit horizontal and half-unit vertical position, according to e = (32b + c + 1024) / 2048. Once again, the operator / indicates division with truncation, and the finally obtained 1/4 resolution sub-pixel values are trimmed to lie in the range [0, 2^n - 1].
5. The values for the 1/4 resolution sub-pixels marked with the letter g are calculated using the value of the nearest original pixel A and the intermediate values of the three nearest-neighbor 1/2 resolution sub-pixels, according to g = (1024A + 32b + 32c1 + c2 + 2048) / 4096. As before, the operator / indicates division with truncation, and the finally obtained 1/4 resolution sub-pixel values g are trimmed to lie in the range [0, 2^n - 1].
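The direct formula of step 5 can be checked with a short sketch, assuming intermediates b, c1 and c2 computed as in steps 1 and 2 (the helper name is ours):

```python
def tml6_quarter_pel_g(A, b, c1, c2, n=8):
    """TML6 step 5: 1/4 resolution sub-pixel g from the nearest original
    pixel A and the intermediate values of the three nearest 1/2 resolution
    sub-pixels. The weights 1024, 32, 32 and 1 compensate for the different
    scaling of the intermediates (b and c1 carry a factor of 32, c2 a
    factor of 1024)."""
    g = (1024 * A + 32 * b + 32 * c1 + c2 + 2048) // 4096
    return min(max(g, 0), (1 << n) - 1)
```

For a flat area of value 100, for example, the intermediates are b = c1 = 3200 and c2 = 102400, and g evaluates back to 100.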
6. The value for the 1/4 resolution sub-pixel f is interpolated by averaging the values of the 4 closest pixels at unit horizontal and unit vertical positions according to f = (A1 + A2 + A3 + A4 + 2) / 4, where A1, A2, A3 and A4 are the four closest original pixels.
In the decoder, sub-pixel values can be obtained directly by applying six-tap filters in the horizontal and vertical directions. In the case of 1/4 sub-pixel resolution, with reference to Figure 13a, the filter coefficients applied to the pixels and sub-pixels at unit vertical positions are [0, 0, 64, 0, 0, 0] for a set of six pixels A, [1, -5, 52, 20, -5, 1] for a set of six sub-pixels d, [2, -10, 40, 40, -10, 2] for a set of six sub-pixels b, and [1, -5, 20, 52, -5, 1] for a set of six sub-pixels d. These filter coefficients are applied to the respective sets of pixels or sub-pixels in the same row as the sub-pixel value being interpolated.
After applying the filters in the horizontal and vertical directions, the interpolated value c is normalized according to c = (c + 2048) / 4096 and trimmed to lie in the range [0, 2^n - 1]. When a motion vector points to an integer pixel position in either the horizontal or vertical direction, many zero coefficients are used. In a practical implementation of TML6, different branches that are optimized for the different sub-pixel cases are used in the software, so there are no multiplications by zero coefficients.
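The separable decoder-side filtering described above can be sketched as follows. This is a hedged sketch assuming, as the passage indicates, that the same four coefficient sets are applied first in one direction and then in the other over a 6 x 6 neighborhood of integer pixels, followed by the normalization c = (c + 2048) / 4096 and trimming; the table layout and names are our own:

```python
# Six-tap filters for quarter-pel offsets 0..3; each set sums to 64.
QUARTER_PEL_FILTERS = {
    0: (0, 0, 64, 0, 0, 0),       # integer position (scaled pass-through)
    1: (1, -5, 52, 20, -5, 1),    # first quarter-pel position (d)
    2: (2, -10, 40, 40, -10, 2),  # half-pel position (b)
    3: (1, -5, 20, 52, -5, 1),    # third quarter-pel position (d)
}

def decode_subpel(samples, dx, dy, n=8):
    """Direct interpolation sketch for the decoder: filter a 6 x 6 block of
    integer pixels vertically, then horizontally, and normalize.
    `samples` is a 6 x 6 list of rows; dx, dy are quarter-pel offsets 0..3."""
    col_taps = QUARTER_PEL_FILTERS[dy]
    row_taps = QUARTER_PEL_FILTERS[dx]
    # Vertical pass: one intermediate value per column.
    cols = [sum(t * samples[i][j] for i, t in enumerate(col_taps))
            for j in range(6)]
    # Horizontal pass over the intermediates, then normalize and trim.
    c = sum(t * v for t, v in zip(row_taps, cols))
    c = (c + 2048) // 4096
    return min(max(c, 0), (1 << n) - 1)
```

In a practical implementation the zero coefficients of the integer-position filter would be skipped, as noted above.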
It should be noted that in TML6 the 1/4 resolution sub-pixel values are obtained directly using the intermediate values referred to above and are not derived from rounded and trimmed values for the 1/2 resolution sub-pixels. Therefore, in obtaining the 1/4 resolution sub-pixel values, it is not necessary to calculate final values for any of the 1/2 resolution sub-pixels. Specifically, it is not necessary to perform the truncation and trimming operations associated with the calculation of final values for the 1/2 resolution sub-pixels. It is also not necessary to store final values for the 1/2 resolution sub-pixels for use in the calculation of the 1/4 resolution sub-pixel values. Therefore, TML6 is computationally less complex than TML5, since fewer truncation and trimming operations are required. However, a drawback of TML6 is that high-precision arithmetic is required in both the encoder and the decoder. High-precision interpolation requires more silicon area in ASICs and requires more calculations on some CPUs. In addition, implementing direct interpolation as specified in TML6 in an on-demand fashion has a high memory requirement. This is an important factor, particularly in embedded devices.
In view of the description presented above, it should be appreciated that due to the different requirements of the video encoder and decoder with respect to subpixel interpolation, there is a significant problem in the development of a method of interpolation of subpixel values capable of providing a satisfactory performance in both the encoder and the decoder. In addition, none of the current test models (TML5, TML6) described above can provide a solution that is optimal for application in both the encoder and the decoder.
US 5,521,642 discloses a simplified decoding system for providing a reduced image frame to a high-definition television receiver with a small screen size through the use of DC transform coefficients. The decoding system selectively decodes and inverse quantizes DC transform coefficients to produce a set of difference data, each of which represents an average of the pixel differences between a two-dimensional block of pixels of a current frame and a corresponding block of its previous frame. Each of the two-dimensional motion vectors is also decoded and modified to derive pixel data of the previous reduced image frame. The derived pixel data and the average pixel difference value are combined successively to generate the reduced frame.
The published European patent application EP 0 576 290 relates to image signal coding and decoding methods that eliminate a situation in which, when an encoded image of a high-definition television system is thinned by half in each of the vertical and horizontal directions and displayed on a television receiver of a lower-definition system, the displayed image does not exhibit smooth movement due to the loss of the interlaced structure. In an encoder, the image element data is processed by DCT processing to obtain 8 x 8 coefficient data, and the coefficient data is transmitted. In a decoder, of the 8 x 8 coefficient data, only the 4 x 4 coefficient data in the upper left corner are sampled and processed by IDCT processing to obtain original image element data. In the IDCT processing, those of the 4 x 4 coefficient data belonging to the fourth row are replaced by those of the coefficient data of the eighth row of the 8 x 8 coefficient data.
Summary of the invention
According to a first aspect of the invention, a method according to claim 1 is provided.
Preferably, first and second weights are used in the weighted average referred to in (c), the relative magnitudes of the weights being inversely proportional to the straight-line (diagonal) proximity of the first and second sub-pixel or pixel to the sub-pixel at the 1/2^N unit horizontal and 1/2^N unit vertical position.

In a situation where the first and second sub-pixels or pixels are located symmetrically with respect to (equidistant from) the sub-pixel at the 1/2^N unit horizontal and 1/2^N unit vertical position, the first and second weights may have equal values.
The first weighted sum of values for sub-pixels that reside at 1/2^(N-1) unit horizontal and unit vertical positions in step b) can be used when a sub-pixel at a 1/2^(N-1) unit horizontal and 1/2^N unit vertical position is required.

The second weighted sum of values for sub-pixels that reside at unit horizontal and 1/2^(N-1) unit vertical positions in step b) can be used when a sub-pixel at a 1/2^N unit horizontal and 1/2^(N-1) unit vertical position is required.

In one embodiment, when values are required for sub-pixels at 1/2^N unit horizontal and unit vertical positions, and at 1/2^N unit horizontal and 1/2^(N-1) unit vertical positions, those values are interpolated by taking the average of the values of a first pixel or sub-pixel located at a vertical position corresponding to that of the sub-pixel being calculated and at the unit horizontal position, and a second pixel or sub-pixel located at a vertical position corresponding to that of the sub-pixel being calculated and at the 1/2^(N-1) unit horizontal position.
When the values for sub-pixels are required at unit horizontal and 1/2^N unit vertical positions, and at 1/2^(N-1) unit horizontal and 1/2^N unit vertical positions, they can be interpolated by taking the average of the values of a first pixel or sub-pixel located at a horizontal position corresponding to that of the sub-pixel being calculated and at the unit vertical position, and a second pixel or sub-pixel located at a horizontal position corresponding to that of the sub-pixel being calculated and at the 1/2^(N-1) unit vertical position.
The values for sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical positions can be interpolated by taking the average of the values of a pixel located at a unit horizontal and unit vertical position and a sub-pixel located at a 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical position.

The values for sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical positions can be interpolated by taking the average of the values of a sub-pixel located at a 1/2^(N-1) unit horizontal and unit vertical position and a sub-pixel located at a unit horizontal and 1/2^(N-1) unit vertical position.

The values for half of the sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical positions can be interpolated by taking the average of a first pair of values, of a sub-pixel located at a 1/2^(N-1) unit horizontal and unit vertical position and a sub-pixel located at a unit horizontal and 1/2^(N-1) unit vertical position, and the values for the other half of the sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical positions can be interpolated by taking the average of a second pair of values, of a pixel located at a unit horizontal and unit vertical position and a sub-pixel located at a 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical position.

The values for sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical positions can be interpolated alternately, for one such sub-pixel by taking the average of a first pair of values, of a sub-pixel located at a 1/2^(N-1) unit horizontal and unit vertical position and a sub-pixel located at a unit horizontal and 1/2^(N-1) unit vertical position, and for a neighboring such sub-pixel by taking the average of a second pair of values, of a pixel located at a unit horizontal and unit vertical position and a sub-pixel located at a 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical position.

The sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical positions can be alternately interpolated in a horizontal direction.

When values are required for some sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical positions, these values can be interpolated by taking the average of a plurality of nearest neighboring pixels.

An intermediate value for a sub-pixel having a 1/2^(N-1) sub-pixel resolution can be used in the calculation of a sub-pixel value having a 1/2^N sub-pixel resolution.
In addition, a video coding interpolation method is provided in which an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal positions and the pixels in the columns residing at unit vertical positions, is interpolated to generate values for sub-pixels at fractional horizontal and vertical positions, the method comprising:
a) when the values for subpixels in horizontal half-unit and vertical unit positions, and in horizontal unit and vertical half-unit positions are required, interpolation of those values directly uses weighted sums of pixels that reside in horizontal unit positions and vertical unit;
b) when the values for sub-pixels in horizontal half-unit and vertical half-unit positions are required, interpolation of those values directly uses a weighted sum of values for sub-pixels that reside in horizontal half-unit and vertical unit positions calculated in accordance with step a);
and
c) when the values for sub-pixels in horizontal quarter-unit and vertical quarter-unit positions are required, those values are interpolated by taking the average of at least one pair of: a first pair of values, of a sub-pixel located in a horizontal half-unit and vertical unit position and a sub-pixel located in a horizontal unit and vertical half-unit position; and a second pair of values, of a pixel located in a horizontal unit and vertical unit position and a sub-pixel located in a horizontal half-unit and vertical half-unit position.
In addition, a video coding interpolation method is provided in which an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal positions and the pixels in the columns residing at unit vertical positions, is interpolated to generate values for sub-pixels at fractional horizontal and vertical positions, the fractional horizontal and vertical positions being defined according to 1/2^x, where x is a positive integer having a maximum value N, the method comprising:
a) when the values for the sub-pixels at 1/2^(N-1) unit horizontal and unit vertical positions, and at unit horizontal and 1/2^(N-1) unit vertical positions are required, those values are interpolated directly using weighted sums of pixels that reside at unit horizontal and unit vertical positions;
b) when a value for a sub-pixel at a sub-pixel horizontal and sub-pixel vertical position is required, that value is interpolated directly using a choice of a first weighted sum of values for sub-pixels located at a vertical position corresponding to that of the sub-pixel being calculated and a second weighted sum of values for sub-pixels located at a horizontal position corresponding to that of the sub-pixel being calculated.
The sub-pixels used in the first weighted sum can be sub-pixels that reside at 1/2^(N-1) unit horizontal and unit vertical positions, and the first weighted sum can be used to interpolate a value for a sub-pixel at a 1/2^(N-1) unit horizontal and 1/2^N unit vertical position.

The sub-pixels used in the second weighted sum can be sub-pixels that reside at unit horizontal and 1/2^(N-1) unit vertical positions, and the second weighted sum can be used to interpolate a value for a sub-pixel at a 1/2^N unit horizontal and 1/2^(N-1) unit vertical position.

When the values for sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical positions are required, they can be interpolated by taking the average of at least one pair of: a first pair of values, of a sub-pixel located at a 1/2^(N-1) unit horizontal and unit vertical position and a sub-pixel located at a unit horizontal and 1/2^(N-1) unit vertical position; and a second pair of values, of a pixel located at a unit horizontal and unit vertical position and a sub-pixel located at a 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical position.
In the foregoing, N can be equal to an integer selected from a list consisting of the values 2, 3, and 4.
Sub-pixels at a quarter-unit horizontal position are to be interpreted as sub-pixels that have as their nearest neighbor on the left a pixel at a unit horizontal position and as their nearest neighbor on the right a sub-pixel at a half-unit horizontal position, as well as sub-pixels that have as their nearest neighbor on the left a sub-pixel at a half-unit horizontal position and as their nearest neighbor on the right a pixel at a unit horizontal position. Correspondingly, sub-pixels at a quarter-unit vertical position are to be interpreted as sub-pixels that have as their nearest neighbor above a pixel at a unit vertical position and as their nearest neighbor below a sub-pixel at a half-unit vertical position, as well as sub-pixels that have as their nearest neighbor above a sub-pixel at a half-unit vertical position and as their nearest neighbor below a pixel at a unit vertical position.
The term "dynamic range" refers to the range of values that sub-pixel values and weighted sums can take.
Preferably, by changing the dynamic range, whether extending or reducing it, the number of bits used to represent the dynamic range is changed.
In one embodiment of the invention, the method is applied to an image that is subdivided into a number of image blocks. Preferably, each image block comprises four corners, each corner being defined by a pixel located in a horizontal unit and vertical unit position. Preferably, the method is applied to each image block as the block becomes available for interpolation of sub-pixel values. Alternatively, the interpolation of sub-pixel values according to the method of the invention is performed once all the image blocks of an image have become available for interpolation of sub-pixel values. Preferably, the method is used in video coding. Preferably, the method is used in video decoding.
In one embodiment of the invention, when used in coding, the method is carried out as interpolation beforehand, in which the values for all sub-pixels in half-unit positions and the values for all sub-pixels in quarter-unit positions are calculated and stored before being used later in the determination of a prediction frame during motion-predictive coding. In alternative embodiments, the method is carried out as a combination of interpolation beforehand and on demand. In this case, a certain proportion or category of sub-pixel values is calculated and stored before being used in determining a prediction frame, and some other sub-pixel values are calculated only when required during motion-prediction coding.
Preferably, when the method is used in decoding, the subpixels are only interpolated when their need is indicated by a motion vector.
According to a further aspect of the invention, an interpolator according to claim 8 is provided.
According to an embodiment of the invention, the interpolator is comprised in a video encoder to encode an image.
The video coder may comprise a video encoder. It may comprise a video decoder. It may comprise a codec that comprises both a video encoder and a video decoder.
According to an embodiment of the invention, the interpolator is comprised in a video encoder for encoding an image, video encoder that is comprised in a communications terminal.
The communications terminal may comprise a video encoder. It may comprise a video decoder. Preferably, it comprises a video codec comprising a video encoder and a video decoder.
Preferably, the communications terminal comprises a user interface, a processor and at least one of a transmission block and a receiver block, and a video encoder for encoding an image, the video encoder comprising an interpolator according to the claim 8. Preferably, the processor controls the operation of the transmission block and / or the receiver block and the video encoder.
According to an embodiment of the invention, the interpolator is comprised in a video encoder for encoding an image, video encoder that is comprised in a communications terminal that is comprised, together with a network, in a telecommunications system, in which the telecommunications network and the communications terminal are connected by a communications link over which encoded video can be transmitted.
Preferably, the telecommunications system is a mobile telecommunications system comprising a mobile communications terminal and a wireless network, the connection between the mobile communications terminal and the wireless network is formed by a radio link. Preferably, the network allows the communication terminal to communicate with other communication terminals connected to the network over communication links between the other communication terminals and the network.
According to an embodiment of the invention, the interpolator is comprised in a video encoder for encoding an image, video encoder that is comprised in a network that is comprised, together with a communications terminal, in a telecommunications system, in which the telecommunications network and the communications terminal are connected by a communications link over which encoded video can be transmitted.
In addition, a video encoder is provided to encode an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows reside in horizontal unit positions and the pixels in the columns reside in vertical unit positions, the encoder comprises an interpolator adapted to generate values for subpixels in fractional horizontal and vertical positions, the resolution of the subpixels is determined by a positive integer N, the interpolator is adapted to:
a) interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical positions and at unit horizontal and 1/2^(N-1) unit vertical positions directly using weighted sums of pixels that reside at unit horizontal and unit vertical positions;

b) interpolate a value for a sub-pixel at a sub-pixel horizontal and sub-pixel vertical position directly using a choice of a first weighted sum of values for sub-pixels located at a vertical position corresponding to that of the sub-pixel being calculated and a second weighted sum of values for sub-pixels located at a horizontal position corresponding to that of the sub-pixel being calculated.
The interpolator can also be adapted to form the first weighted sum using sub-pixel values residing at 1/2^(N-1)-unit horizontal and unit vertical positions, and to use the first weighted sum to interpolate a value for a sub-pixel at a 1/2^(N-1)-unit horizontal position and a 1/2^N-unit vertical position.
The interpolator may also be adapted to form the second weighted sum using sub-pixel values residing at unit horizontal and 1/2^(N-1)-unit vertical positions, and to use the second weighted sum to interpolate a value for a sub-pixel at a 1/2^N-unit horizontal position and a 1/2^(N-1)-unit vertical position.
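The direct interpolation by weighted sums described above can be sketched as follows. This is an illustrative Python sketch only, not the claimed implementation: the six-tap filter coefficients (1, -5, 20, 20, -5, 1), the rounding offset and the 8-bit dynamic range are assumptions in the style of the H.26L test models discussed later in the description, and the exact weights used by the invention may differ.

```python
def clip(value, max_value=255):
    """Clamp an interpolated value to the specified dynamic range
    (an 8-bit range of 0..255 is assumed here)."""
    return max(0, min(max_value, value))

def half_pel(p0, p1, p2, p3, p4, p5):
    """Directly interpolate a sub-pixel at a half-unit position from six
    pixel values lying on the same row (or column), using an assumed
    6-tap filter (1, -5, 20, 20, -5, 1), normalized by 32 with rounding."""
    acc = p0 - 5 * p1 + 20 * p2 + 20 * p3 - 5 * p4 + p5
    return clip((acc + 16) >> 5)  # divide by 32 with rounding, then clip

# Example: a flat row of pixels interpolates to the same value.
row = [100, 100, 100, 100, 100, 100]
print(half_pel(*row))  # 100
```

The same weighted sum can be applied either to a row of pixels or to a column, which is what the choice between the "first" and "second" weighted sums in step b) corresponds to.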
The interpolator can also be adapted to interpolate values for sub-pixels at 1/2^N-unit horizontal and 1/2^N-unit vertical positions by taking the average of at least one of: a first pair of values, of a sub-pixel located at a 1/2^(N-1)-unit horizontal and unit vertical position and a sub-pixel located at a unit horizontal and 1/2^(N-1)-unit vertical position; and a second pair of values, of a pixel located at a unit horizontal and unit vertical position and a sub-pixel located at a 1/2^(N-1)-unit horizontal and 1/2^(N-1)-unit vertical position.
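The averaging just described can be sketched minimally as follows; rounding up by adding 1 before the shift is an assumption, not taken from the source, and truncation would be an equally plausible reading.

```python
def average_pair(v1, v2):
    """Average one of the value pairs described above (e.g. a pixel and a
    diagonally opposite sub-pixel, or two half-unit-position sub-pixels)
    to obtain a sub-pixel value at a 1/2^N-unit position.
    The '+ 1' rounding offset is an assumption."""
    return (v1 + v2 + 1) >> 1

print(average_pair(100, 103))  # 102
print(average_pair(0, 255))    # 128
```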
In addition, a communications terminal is provided which comprises a video encoder for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing in unit horizontal positions and the pixels in the columns residing in unit vertical positions. The encoder comprises an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical positions, the resolution of the sub-pixels being determined by a positive integer N. The interpolator is adapted to:
a) interpolate values for sub-pixels at 1/2^(N-1)-unit horizontal and unit vertical positions, and at unit horizontal and 1/2^(N-1)-unit vertical positions, directly using weighted sums of pixels residing at unit horizontal and unit vertical positions;
b) interpolate a value for a sub-pixel at a horizontal sub-pixel position and vertical sub-pixel position directly, using a choice of a first weighted sum of values for sub-pixels located at a vertical position corresponding to that of the sub-pixel being calculated and a second weighted sum of values for sub-pixels located at a horizontal position corresponding to that of the sub-pixel being calculated.
In addition, a telecommunications system is provided comprising a communications terminal and a network, the network and the communications terminal being connected by a communications link over which encoded video can be transmitted. The communications terminal comprises a video encoder for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing in unit horizontal positions and the pixels in the columns residing in unit vertical positions. The encoder comprises an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical positions, the resolution of the sub-pixels being determined by a positive integer N. The interpolator is adapted to:
a) interpolate values for sub-pixels at 1/2^(N-1)-unit horizontal and unit vertical positions, and at unit horizontal and 1/2^(N-1)-unit vertical positions, directly using weighted sums of pixels residing at unit horizontal and unit vertical positions;
b) interpolate a value for a sub-pixel at a horizontal sub-pixel position and vertical sub-pixel position directly, using a choice of a first weighted sum of values for sub-pixels located at a vertical position corresponding to that of the sub-pixel being calculated and a second weighted sum of values for sub-pixels located at a horizontal position corresponding to that of the sub-pixel being calculated.
In addition, a telecommunications system is provided comprising a communications terminal and a network, the network and the communications terminal being connected by a communications link over which encoded video can be transmitted. The network comprises a video encoder for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing in unit horizontal positions and the pixels in the columns residing in unit vertical positions. The encoder comprises an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical positions, the resolution of the sub-pixels being determined by a positive integer N. The interpolator is adapted to:
a) interpolate values for sub-pixels at 1/2^(N-1)-unit horizontal and unit vertical positions, and at unit horizontal and 1/2^(N-1)-unit vertical positions, directly using weighted sums of pixels residing at unit horizontal and unit vertical positions;
b) interpolate a value for a sub-pixel at a horizontal sub-pixel position and vertical sub-pixel position directly, using a choice of a first weighted sum of values for sub-pixels located at a vertical position corresponding to that of the sub-pixel being calculated and a second weighted sum of values for sub-pixels located at a horizontal position corresponding to that of the sub-pixel being calculated.
According to a further aspect of the invention, a computer program according to claim 19 is provided.
Brief description of the figures
Next, an embodiment of the invention is described by way of example only with reference to the accompanying drawings, in which:
Figure 1 shows a video encoder according to the prior art; Figure 2 shows a video decoder according to the prior art; Figure 3 shows the types of frames used in video coding; Figures 4a, 4b and 4c show the stages in block matching; Figure 5 illustrates the process of motion estimation to sub-pixel resolution;
Figure 6 shows a terminal device comprising video encoding and decoding equipment in which the method of the invention can be implemented;
Figure 7 shows a video encoder according to an embodiment of the present invention; Figure 8 shows a video decoder according to an embodiment of the present invention; Figures 9 and 10 are not used;
Figure 11 shows a schematic diagram of a mobile telecommunications network in accordance with an embodiment of the present invention;
Figure 12a shows a notation for describing pixel and sub-pixel positions specific to TML5; Figure 12b shows interpolation of half-resolution sub-pixels; Figure 12c shows interpolation of half-resolution sub-pixels;
Figure 13a shows a notation for describing pixel and sub-pixel positions specific to TML6; Figure 13b shows interpolation of half-resolution sub-pixels; Figure 13c shows interpolation of half-resolution sub-pixels;
Figure 14a shows a notation for describing pixel and sub-pixel positions specific to the invention; Figure 14b shows interpolation of half-resolution sub-pixels according to the invention; Figure 14c shows interpolation of half-resolution sub-pixels according to the invention; Figure 15 shows possible diagonal interpolation choices for sub-pixels;
Figure 16 shows the half-resolution sub-pixel values required to calculate other half-resolution sub-pixel values;
Figure 17a shows the half-resolution sub-pixel values that must be calculated in order to interpolate values for quarter-resolution sub-pixels in an image block using the TML5 interpolation method;
Figure 17b shows the half-resolution sub-pixel values that must be calculated in order to interpolate values for quarter-resolution sub-pixels in an image block using the interpolation method according to the invention;
Figure 18a shows the number of half-resolution sub-pixels that must be calculated in order to obtain values for quarter-resolution sub-pixels within an image block using the method of interpolating sub-pixel values according to TML5;
Figure 18b shows the number of half-resolution sub-pixels that must be calculated in order to obtain values for quarter-resolution sub-pixels within an image block using the method of interpolating sub-pixel values according to the invention;
Figure 19 shows a numbering scheme for each of the subpixel positions;
Figure 20 shows the nomenclature used to describe pixels, half-resolution sub-pixels, quarter-resolution sub-pixels and one-eighth-resolution sub-pixels;
Figure 21a shows the diagonal direction used in the interpolation of each one-eighth-resolution sub-pixel in an embodiment of the invention;
Figure 21b shows the diagonal direction used in the interpolation of each one-eighth-resolution sub-pixel in another embodiment of the invention; and
Figure 22 shows the nomenclature used to describe one-eighth-resolution sub-pixels within an image block.
Detailed description
Figures 1 to 5, 12a, 12b, 12c, 13a, 13b, and 13c have been described above.
Figure 6 shows a terminal device comprising video encoding and decoding equipment that can be adapted to operate in accordance with the present invention. More precisely, the figure illustrates a multimedia terminal 60 implemented in accordance with ITU-T Recommendation H.324. The terminal can be considered a multimedia transceiver device. It includes elements that capture, encode and multiplex multimedia data streams for transmission over a communications network, as well as elements that receive, demultiplex, decode and display received multimedia content. ITU-T Recommendation H.324 defines the overall operation of the terminal and refers to other recommendations that govern the operation of its various constituent parts. This type of multimedia terminal can be used in real-time applications such as conversational videotelephony, or in non-real-time applications such as the retrieval and/or streaming of video clips, for example from a multimedia content server on the Internet.
In the context of the present invention, it should be appreciated that the H.324 terminal shown in Figure 6 is only one of a number of alternative multimedia terminal implementations suitable for application of the method of the invention. It should also be noted that there are a number of alternatives relating to the location and implementation of the terminal equipment. As illustrated in Figure 6, the multimedia terminal may be located in communications equipment connected to a fixed-line telephone network such as an analog public switched telephone network (PSTN). In this case, the multimedia terminal is equipped with a modem 71, compliant with ITU-T recommendations V.8, V.34 and optionally V.8bis. Alternatively, the multimedia terminal may be connected to an external modem. The modem enables conversion of the multiplexed digital data and control signals produced by the multimedia terminal into an analog form suitable for transmission over the PSTN. It further enables the multimedia terminal to receive data and control signals in analog form from the PSTN and to convert them into a digital data stream that can be demultiplexed and processed appropriately by the terminal.
An H.324 multimedia terminal can also be implemented in such a way that it can connect directly to a digital fixed line network, such as an integrated services digital network (ISDN). In this case, modem 71 is replaced by an ISDN user network interface. In Figure 6, this ISDN user-network interface is represented by the alternative block 72.
H.324 multimedia terminals can also be adapted for use in mobile communications applications. If used with a wireless communication link, modem 71 may be replaced by any appropriate wireless interface, as represented by block 73 in Figure 6. For example, an H.324/M multimedia terminal may include a radio transceiver enabling connection to the current 2nd generation GSM mobile telephone network or to the 3rd generation Universal Mobile Telephone System (UMTS).
It should be noted that, in multimedia terminals designed for two-way communication, that is, for transmission and reception of video data, it is advantageous to provide both a video encoder and a video decoder implemented in accordance with the present invention. Such an encoder and decoder pair is often implemented as a single combined functional unit, referred to as a "codec".
Since a video encoder according to the invention performs motion-compensated video encoding at sub-pixel resolution using a specific interpolation scheme and a particular combination of before-hand and on-demand interpolation of sub-pixel values, it is generally necessary for the video decoder of a receiving terminal to be implemented in a manner compatible with the encoder of the transmitting terminal that formed the compressed video data stream. Failure to ensure this compatibility may adversely affect the quality of motion compensation and the accuracy of the reconstructed video frames.
Next, a typical H.324 multimedia terminal is described in more detail with reference to Figure 6.
The multimedia terminal 60 includes a variety of elements referred to as "terminal equipment". This includes video, audio and telematic devices, indicated generally by reference numbers 61, 62 and 63, respectively. The video equipment 61 may include, for example, a video camera for capturing video images, a monitor for displaying received video content and optional video processing equipment. The audio equipment 62 typically includes a microphone, for example to capture spoken messages, and a loudspeaker to reproduce received audio content. The audio equipment may also include additional audio processing units. The telematic equipment 63 may include a data terminal, keyboard, electronic whiteboard or a still image transceiver, such as a fax unit.
The video equipment 61 is coupled to a video codec 65. The video codec 65 comprises a corresponding video encoder and video decoder, both implemented in accordance with the invention. Said encoder and decoder are described below. The video codec 65 is responsible for encoding captured video data in a form suitable for onward transmission over a communications link and for decoding compressed video content received from the communications network. In the example illustrated in Figure 6, the video codec is implemented in accordance with ITU-T Recommendation H.263, with appropriate modifications to implement the sub-pixel value interpolation method according to the invention in both the encoder and the decoder of the video codec.
Similarly, the terminal's audio equipment is coupled to an audio codec, indicated in Figure 6 by reference number 66. Like the video codec, the audio codec comprises an encoder/decoder pair. It converts audio data captured by the terminal's audio equipment into a form suitable for transmission over the communications link, and transforms encoded audio data received from the network back into a form suitable for reproduction, for example on the terminal's loudspeaker. The output of the audio codec is passed to a delay block 67. This compensates for the delays introduced by the video encoding process and thereby ensures synchronization of the audio and video content.
The system control block 64 of the multimedia terminal controls end-to-network signaling using an
appropriate control protocol (signaling block 68) to establish a common mode of operation between a transmitting terminal and a receiving terminal. The signaling block 68 exchanges information about the encoding and decoding capabilities of the transmitting and receiving terminals and can be used to enable the various coding modes of the video encoder. The system control block 64 also controls the use of data encryption. Information regarding the type of encryption to be used in data transmission is passed from encryption block 69 to the multiplexer/demultiplexer (MUX/DMUX unit) 70.
During data transmission from the multimedia terminal, the MUX/DMUX unit 70 combines encoded and synchronized video and audio streams with data input from the telematic equipment 63 and possible control data, to form a single bit stream. Information concerning the type of data encryption (if any) to be applied to the bit stream, provided by encryption block 69, is used to select an encryption mode. Correspondingly, when a multiplexed and possibly encrypted multimedia bit stream is received, the MUX/DMUX unit 70 is responsible for decrypting the bit stream, dividing it into its constituent multimedia components and passing those components to the appropriate codec(s) and/or terminal equipment for decoding and reproduction.
It should be noted that the functional elements of the multimedia terminal, video encoder, video decoder and video codec according to the invention can be implemented as dedicated software or hardware, or a combination of the two. The video encoding and decoding methods according to the invention are particularly suited to implementation in the form of a computer program comprising machine-readable instructions for performing the functional steps of the invention. As such, the encoder and decoder according to the invention can be implemented as software code stored on a storage medium and executed on a computer, such as a personal computer, to provide that computer with video encoding/decoding functionality.
If the multimedia terminal 60 is a mobile terminal, that is, if it is equipped with a radio transceiver 73, those skilled in the art will understand that it may also comprise additional elements. In one embodiment, it comprises a user interface having a display and a keyboard, which enables a user to operate the multimedia terminal 60, together with the necessary functional blocks, including a central processing unit, such as a microprocessor, which controls the blocks responsible for the different functions of the multimedia terminal, a random access memory RAM, a read-only memory ROM, and a digital camera. The operating instructions of the microprocessor, that is, program code corresponding to the basic functions of the multimedia terminal 60, are stored in the read-only memory ROM and can be executed as required by the microprocessor, for example under control of the user. In accordance with the program code, the microprocessor uses the radio transceiver 73 to form a connection with a mobile communication network, enabling the multimedia terminal 60 to transmit information to and receive information from the mobile communication network over a radio path.
The microprocessor monitors the status of the user interface and controls the digital camera. In response to a user command, the microprocessor instructs the camera to record digital images in the RAM. Once an image is captured, or alternatively during the capture process, the microprocessor segments the image into image segments (for example, macroblocks) and uses the encoder to perform motion-compensated encoding of the segments in order to generate a compressed image sequence, as explained in the foregoing description. A user can command the multimedia terminal 60 to display the captured images on its display, or to send the compressed image sequence using the radio transceiver 73 to another multimedia terminal, a videophone connected to a fixed-line network (PSTN) or some other telecommunications device. In a preferred embodiment, transmission of image data starts as soon as the first segment is encoded, so that the receiver can start a corresponding decoding process with a minimum delay.
Fig. 11 is a schematic diagram of a mobile telecommunications network according to an embodiment of the invention. Multimedia terminals MS are in communication with base stations BTS via a radio link. The base stations BTS are further connected, through the so-called Abis interface, to a base station controller BSC, which controls and manages several base stations.
The entity formed by a number of base stations BTS (usually a few dozen base stations) and a single base station controller BSC, which controls the base stations, is called a base station subsystem BSS. In particular, the base station controller BSC manages radio communication channels and handovers. The base station controller BSC is also connected, via the so-called A interface, to a mobile services switching center MSC, which coordinates the formation of connections to and from mobile stations. A further connection is made, via the mobile services switching center MSC, to outside the mobile communication network. Outside the mobile communication network, further network(s) connected to the mobile communication network may also reside, accessed via gateway(s) GTW, for example the Internet or a public switched telephone network (PSTN). In such an external network, or within the telecommunications network, there may be video decoding or encoding stations, such as PC computers. In an embodiment of the invention, the mobile telecommunications network comprises a server
VSRVR to provide video data to an MS subscribing to that service. The video data is compressed using the motion-compensated video compression method described above. The video server may function as a gateway to an online video source, or it may contain previously recorded video clips. Typical videotelephony applications may involve, for example, two mobile stations, or a mobile station MS and a videophone connected to the PSTN, a PC connected to the Internet, or an H.261-compatible terminal connected to either the Internet or the PSTN.
Figure 7 shows a video encoder 700 according to an embodiment of the invention. Figure 8 shows a video decoder 800 according to an embodiment of the invention.
The encoder 700 comprises an input 701 for receiving a video signal from a camera or other video source (not shown). It further comprises a DCT transformer 705, a quantizer 706, an inverse quantizer 709, an inverse DCT transformer 710, combiners 712 and 716, a before-hand sub-pixel value interpolation block 730, a frame store 740 and an on-demand sub-pixel value interpolation block 750, implemented in combination with a motion estimation block 760. The encoder also comprises a motion field coding block 770 and a motion-compensated prediction block 780. Switches 702 and 714 are operated cooperatively by a control manager 720 to switch the encoder between an INTRA video coding mode and an INTER video coding mode. The encoder 700 also comprises a multiplexer unit (MUX/DMUX) 790 for forming a single bit stream from the various types of information produced by the encoder 700, for onward transmission to a remote receiving terminal or, for example, for storage on a mass storage medium, such as a computer hard drive (not shown).
It should be noted that the presence and implementation of the before-hand sub-pixel value interpolation block 730 and the on-demand sub-pixel value interpolation block 750 in the encoder architecture depend on the way in which the sub-pixel value interpolation method according to the invention is implemented. In embodiments of the invention in which no sub-pixel value interpolation is performed before-hand, the encoder 700 does not comprise the before-hand sub-pixel value interpolation block 730. In other embodiments of the invention, only before-hand sub-pixel interpolation is performed, and the encoder therefore does not include the on-demand sub-pixel value interpolation block 750. In embodiments in which sub-pixel value interpolation is performed both before-hand and on demand, both blocks 730 and 750 are present in the encoder 700.
Next, the operation of the encoder 700 according to the invention is described in detail. In the description, it is assumed that each frame of uncompressed video, received from the video source at input 701, is received and processed macroblock by macroblock, preferably in raster-scan order. It is also assumed that, when the encoding of a new video sequence begins, the first frame of the sequence is encoded in INTRA mode. Subsequently, the encoder is programmed to encode each frame in INTER format, unless one of the following conditions is met: 1) it is judged that the current frame being encoded is so different from the reference frame used in its prediction that excessive prediction error information would be produced; 2) a predefined INTRA frame repetition interval has expired; or 3) feedback is received from a receiving terminal indicating a request for a frame to be encoded in INTRA format.
The occurrence of condition 1) is detected by monitoring the output of combiner 716. Combiner 716 forms a difference between the current macroblock of the frame being encoded and its prediction, produced in the motion-compensated prediction block 780. If a measure of this difference (for example, a sum of absolute differences of pixel values) exceeds a predetermined threshold, combiner 716 informs control manager 720 via control line 717, and control manager 720 operates switches 702 and 714 so as to switch encoder 700 into INTRA coding mode. The occurrence of condition 2) is monitored by means of a timer or frame counter implemented in control manager 720, such that, if the timer expires or the frame counter reaches a predetermined number of frames, control manager 720 operates switches 702 and 714 to switch the encoder into INTRA coding mode. Condition 3) is triggered if control manager 720 receives a feedback signal from, for example, a receiving terminal, via control line 718, indicating that the receiving terminal requires an INTRA frame refresh. Such a condition could arise, for example, if a previously transmitted frame were severely corrupted by interference during its transmission, making it impossible to decode at the receiver. In that situation, the receiver would issue a request for the next frame to be encoded in INTRA format, thereby re-initializing the coding sequence.
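The difference measure mentioned above, a sum of absolute differences, can be sketched as follows; the SAD measure itself is the example given in the text, while the threshold value, the `needs_intra` helper name and the small block size used in the example are illustrative assumptions.

```python
def sad(current_mb, predicted_mb):
    """Sum of absolute differences between a macroblock of the frame
    being encoded and its motion-compensated prediction, both given
    as row-major lists of rows of pixel values."""
    return sum(abs(c - p)
               for row_c, row_p in zip(current_mb, predicted_mb)
               for c, p in zip(row_c, row_p))

# Condition 1): switch to INTRA mode when the prediction is too poor.
# The threshold value is an illustrative assumption, not from the source.
def needs_intra(current_mb, predicted_mb, threshold=4096):
    return sad(current_mb, predicted_mb) > threshold

cur = [[10, 12], [8, 9]]
pred = [[11, 10], [8, 7]]
print(sad(cur, pred))  # 5
```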
It is further assumed that the encoder and decoder are implemented in such a way that they allow motion vectors to be determined with a spatial resolution of up to a quarter pixel. As will be seen below, finer levels of resolution are also possible.
Next, the operation of the encoder 700 in INTRA coding mode is described. In INTRA mode, control manager 720 operates switch 702 to accept video input from
input 719. The video signal is received macroblock by macroblock from input 701 via line 719, and each macroblock of original image pixels is transformed into DCT coefficients by the DCT transformer 705. The DCT coefficients are then passed to quantizer 706, where they are quantized using a quantization parameter QP. The selection of the quantization parameter QP is controlled by control manager 720 via control line 722. Each DCT-transformed and quantized macroblock, constituting the INTRA-coded image information 723 of the frame, is passed from quantizer 706 to the MUX/DMUX unit 790. The MUX/DMUX unit 790 combines the INTRA-coded image information with possible control information (for example, header data, quantization parameter information, error correction data, etc.) to form a single bit stream of coded image information 725. Variable length coding (VLC) is used to reduce the redundancy of the compressed video bit stream, as is known to those skilled in the art.
A locally decoded image is formed in the encoder 700 by passing the data output by quantizer 706 through inverse quantizer 709 and applying an inverse DCT transform 710 to the inverse-quantized data. The resulting data is then input to combiner 712. In INTRA mode, switch 714 is set so that the input to combiner 712 via switch 714 is zero. Thus, the operation performed by combiner 712 is equivalent to passing the decoded image data formed by inverse quantizer 709 and inverse DCT transform 710 through unaltered.
In embodiments of the invention in which sub-pixel value interpolation is performed before-hand, the output of combiner 712 is applied to the before-hand sub-pixel value interpolation block 730. The input to the before-hand sub-pixel value interpolation block 730 takes the form of decoded image blocks. In the before-hand sub-pixel value interpolation block 730, each decoded macroblock is subjected to sub-pixel interpolation, such that a predetermined subset of sub-pixel values at sub-pixel resolution is calculated according to the interpolation method of the invention and stored together with the decoded pixel values in the frame store 740.
In embodiments in which no sub-pixel interpolation is performed before-hand, the before-hand sub-pixel value interpolation block is not present in the encoder architecture and the output of combiner 712, which comprises decoded image blocks, is applied directly to the frame store 740.
As subsequent macroblocks of the current frame are received and undergo the previously described encoding and decoding steps in blocks 705, 706, 709, 710 and 712, a decoded version of the INTRA frame accumulates in the frame store 740. When the last macroblock of the current frame has been INTRA-coded and subsequently decoded, the frame store 740 contains a completely decoded frame, available for use as a prediction reference frame in the encoding of a subsequently received video frame in INTER format. In embodiments of the invention in which sub-pixel value interpolation is performed before-hand, the reference frame held in the frame store 740 is at least partially interpolated to sub-pixel resolution.
Next, the operation of the encoder 700 in INTER coding mode is described. In INTER coding mode, control manager 720 operates switch 702 to receive its input from line 721, which comprises the output of combiner 716. Combiner 716 forms prediction error information representing the difference between the current macroblock of the frame being encoded and its prediction, produced in the motion-compensated prediction block 780. The prediction error information is DCT-transformed in block 705 and quantized in block 706 to form a macroblock of DCT-transformed and quantized prediction error information. Each DCT-transformed and quantized prediction error macroblock is passed from quantizer 706 to the MUX/DMUX unit 790. The MUX/DMUX unit 790 combines the prediction error information 723 with motion coefficients 724 (described below) and control information (for example, header data, quantization parameter information, error correction data, etc.) to form a single bit stream of coded image information 725.
Locally decoded prediction error information for each macroblock of the INTER-coded frame is then formed in encoder 700 by passing the coded prediction error information 723 output by quantizer 706 through inverse quantizer 709 and applying an inverse DCT transform in block 710. The resulting locally decoded macroblock of prediction error information is then input to combiner 712. In INTER mode, switch 714 is set so that combiner 712 also receives motion-compensated predicted macroblocks for the
As described above, when encoded INTRA frames are considered, in embodiments of the invention in which an interpolation of sub-pixel values is performed in advance, the output of combiner 712 is applied to the sub-pixel interpolation block in advance 730. Therefore, the input to the interpolation block of sub-pixel values beforehand 730 in the INTER coding mode also takes the form
5
10
fifteen
twenty
25
30
35
40
Four. Five
fifty
55
60
65
of decoded image blocks. In the interpolation block of sub-pixel values in advance 730, each decoded macroblock is subjected to sub-pixel interpolation, such that a predetermined subset of sub-pixel values is calculated according to the interpolation method of the invention and stored together with the decoded pixel values in frame store 740. In embodiments in which no sub-pixel interpolation is performed beforehand, the sub-pixel interpolation block in advance is not present in the encoder architecture and the combiner output 12, comprising decoded image blocks, it is applied directly to frame store 740.
As subsequent macroblocks of the video signal are received from the video source and pass through the encoding and decoding steps previously described in blocks 705, 706, 709, 710 and 712, a decoded version of the INTER frame is built up in frame store 740. When the last macroblock in the frame has been INTER-coded and subsequently decoded, frame store 740 contains a completely decoded frame, available for use as a prediction reference frame in the encoding of a subsequently received video frame in INTER format. In embodiments of the invention in which interpolation of sub-pixel values is performed in advance, the reference frame held in frame store 740 is at least partially interpolated to a sub-pixel resolution.
Next, the formation of a prediction for a macroblock of the current frame is described.
Any frame encoded in INTER format requires a reference frame for motion-compensated prediction. This means, among other things, that when a video sequence is encoded, the first frame to be encoded, be it the first frame in the sequence or some other frame, must be encoded in INTRA format. This, in turn, means that when the control manager 720 switches the video encoder 700 into the INTER coding mode, a complete reference frame, formed by locally decoding a previously encoded frame, is already available in the frame store 740 of the encoder. In general, the reference frame is formed by locally decoding either an INTRA-coded frame or an INTER-coded frame.
The first step in forming a prediction for a macroblock of the current frame is performed by the motion estimation block 760. The motion estimation block 760 receives the current macroblock of the frame being encoded via line 727 and performs a block matching operation to identify a region in the reference frame that substantially corresponds to the current macroblock. In accordance with the invention, the block matching process is performed at a sub-pixel resolution in a manner that depends on the implementation of the encoder 700 and on the degree of sub-pixel interpolation performed beforehand. However, the basic principle on which the block matching process is based is similar in all cases. Specifically, the motion estimation block 760 performs block matching by calculating difference values (for example, sums of absolute differences) representing the difference in pixel values between the macroblock of the current frame under examination and candidate best-matching pixel/sub-pixel regions in the reference frame. A difference value is produced for all possible displacements (for example, x and y displacements with quarter- or eighth-sub-pixel precision) between the macroblock of the current frame and a candidate test region within a predefined search region of the reference frame, and the motion estimation block 760 determines the smallest calculated difference value. The offset between the macroblock in the current frame and the candidate test region of pixel/sub-pixel values in the reference frame that produces the smallest difference value defines the motion vector for the macroblock in question. In certain embodiments of the invention, an initial estimate for the motion vector having unit pixel accuracy is determined first and then refined to a finer level of sub-pixel accuracy, as described above.
In embodiments of the encoder in which no interpolation of sub-pixel values is performed in advance, all sub-pixel values required in the block matching process are calculated in the on-demand sub-pixel value interpolation block 750. The motion estimation block 760 controls the on-demand sub-pixel value interpolation block 750 to calculate each sub-pixel value needed in the block matching process on demand, as and when it is required. In this case, the motion estimation block 760 can be implemented to perform block matching as a one-stage process, in which case a motion vector with the desired sub-pixel resolution is searched for directly, or it can be implemented to perform block matching as a two-stage process. If the two-stage process is adopted, the first stage may comprise a search, for example, for a full-pixel or half-pixel resolution motion vector, and the second stage is performed to refine the motion vector to the desired sub-pixel resolution. Since block matching is an exhaustive process, in which blocks of n x m pixels in the current frame are compared one by one with blocks of n x m pixels or sub-pixels in the interpolated reference frame, it should be appreciated that a sub-pixel value calculated on demand by the on-demand sub-pixel interpolation block 750 may need to be calculated many times as successive difference values are determined. In a video encoder, this approach is therefore not as efficient as possible in terms of computational complexity/load.
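The exhaustive difference-value search performed by the motion estimation block can be sketched as follows. This is an illustrative sketch only, not the encoder's actual implementation: the function names, array layout and search range are assumptions, and the search is shown at full-pixel resolution, although the same loop applies unchanged to a reference frame interpolated to sub-pixel resolution.

```python
def sad(block, ref, top, left):
    """Sum of absolute differences between an n x m block and the
    co-sized region of the reference frame whose top-left corner is
    at (top, left)."""
    return sum(abs(block[r][c] - ref[top + r][left + c])
               for r in range(len(block))
               for c in range(len(block[0])))


def best_match(block, ref, by, bx, search):
    """Exhaustive search over all displacements (dy, dx) within
    +/- search pixels of the block position (by, bx); returns the
    displacement with the smallest SAD and that SAD value."""
    h, w = len(block), len(block[0])
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            top, left = by + dy, bx + dx
            # Skip candidate regions that fall outside the reference frame.
            if 0 <= top and top + h <= len(ref) and 0 <= left and left + w <= len(ref[0]):
                cost = sad(block, ref, top, left)
                if cost < best_cost:
                    best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost
```

A two-stage refinement would simply run this search once on full pixels and a second time, with a small range, on the interpolated grid around the first-stage result.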
In embodiments of the encoder that use only interpolation of sub-pixel values in advance, block matching can be performed as a one-stage process, since all sub-pixel values of the reference frame required to determine a motion vector with the desired sub-pixel resolution are calculated in advance in block 730 and stored in frame store 740. They are therefore directly available for use in the block matching process and can be retrieved as required from frame store 740 by the motion estimation block 760. However, even in the case where all sub-pixel values are available in frame store 740, it is still more computationally efficient to perform block matching as a two-stage process, since fewer difference calculations are required. It should be noted that, although full interpolation of sub-pixel values beforehand reduces computational complexity in the encoder, it is not the most efficient approach in terms of memory consumption.
In embodiments of the encoder in which both in-advance and on-demand interpolation of sub-pixel values are used, the motion estimation block 760 is implemented in such a way that it can retrieve sub-pixel values previously calculated in the in-advance sub-pixel value interpolation block 730 and stored in frame store 740, and can also control the on-demand sub-pixel value interpolation block 750 to calculate any additional sub-pixel values that may be required. The block matching process can be performed as a one-stage or a two-stage process. If a two-stage implementation is used, the pre-calculated sub-pixel values retrieved from frame store 740 can be used in the first stage of the process, and the second stage can be implemented to use sub-pixel values calculated by the on-demand sub-pixel value interpolation block 750. In this case, it may be necessary to calculate certain sub-pixel values used in the second stage of the block matching process many times as successive comparisons are made, but the number of such duplicate calculations is significantly smaller than if no in-advance calculation of sub-pixel values were used. In addition, memory consumption is reduced with respect to embodiments in which only interpolation of sub-pixel values in advance is used.
Once the motion estimation block 760 has produced a motion vector for the macroblock of the current frame under examination, it sends the motion vector to the motion field coding block 770. The motion field coding block 770 then approximates the motion vector received from the motion estimation block 760 through the use of a motion model. The motion model generally comprises a set of basis functions. More specifically, the motion field coding block 770 represents the motion vector as a set of coefficient values (known as motion coefficients) that, when multiplied by the basis functions, form an approximation of the motion vector. Motion coefficients 724 are passed from the motion field coding block 770 to the motion compensated prediction block 780. The motion compensated prediction block 780 also receives the pixel/sub-pixel values of the best-matching candidate test region of the reference frame identified by the motion estimation block 760. In Figure 7, these values are shown being passed via line 729 from the on-demand sub-pixel interpolation block 750. In alternative embodiments of the invention, the pixel values in question are provided by the motion estimation block 760 itself.
Using the approximate representation of the motion vector generated by the motion field coding block 770 and the pixel/sub-pixel values of the best-matching candidate test region, the motion compensated prediction block 780 produces a macroblock of predicted pixel values. The macroblock of predicted pixel values represents a prediction of the pixel values of the current macroblock, generated from the interpolated reference frame. The macroblock of predicted pixel values is passed to combiner 716, where it is subtracted from the current macroblock to produce the prediction error information 723 for the macroblock, as described above.
The motion coefficients 724 formed by the motion field coding block are also passed to the MUX/DMUX unit 790, in which they are combined with the prediction error information 723 for the macroblock in question and possible control information from the control manager 720 to form an encoded video stream 725 for transmission to a receiving terminal.
Next, the operation of a video decoder 800 according to the invention is described. Referring to Figure 8, the decoder 800 comprises a demultiplexer unit (MUX/DMUX) 810, which receives and demultiplexes the encoded video stream 725 from the encoder 700, an inverse quantizer 820, an inverse DCT transformer 830, a motion compensated prediction block 840, a frame store 850, a combiner 860, a control manager 870, an output 880, an in-advance sub-pixel value interpolation block 845 and an on-demand sub-pixel interpolation block 890 associated with the motion compensated prediction block 840. In practice, the control manager 870 of the decoder 800 and the control manager 720 of the encoder 700 may be the same processor. This may be the case if the encoder 700 and the decoder 800 are part of the same video codec.
Figure 8 shows an embodiment in which a combination of in-advance and on-demand interpolation of sub-pixel values is used in the decoder. In other embodiments, only interpolation of sub-pixel values in advance is used, in which case the decoder 800 does not include the on-demand sub-pixel value interpolation block 890. In a preferred embodiment of the invention, no interpolation of sub-pixel values in advance is performed in the decoder and, therefore, the in-advance sub-pixel value interpolation block 845 is omitted from the decoder architecture. If both in-advance and on-demand interpolation of sub-pixel values are used, the decoder comprises blocks 845 and 890.
Control manager 870 controls the operation of decoder 800 in response to whether an INTRA frame or an INTER frame is being decoded. An INTRA/INTER trigger control signal, which causes the decoder to switch between decoding modes, is derived, for example, from picture type information provided in the header portion of each compressed video frame received from the encoder. The INTRA/INTER trigger control signal is passed to control manager 870 via control line 815, together with other video codec control signals demultiplexed from the encoded video stream 725 by the MUX/DMUX unit 810.
When an INTRA frame is decoded, the encoded video stream 725 is demultiplexed into INTRA-coded macroblocks and control information. No motion vectors are included in the encoded video stream 725 for an INTRA-coded frame. The decoding process is performed macroblock by macroblock. When the encoded information 723 for a macroblock is extracted from the video stream 725 by the MUX/DMUX unit 810, it is passed to the inverse quantizer 820. The control manager 870 controls the inverse quantizer 820 to apply an appropriate level of inverse quantization to the macroblock of encoded information, according to control information provided in the video stream 725. The inverse-quantized macroblock is then inversely transformed in the inverse DCT transformer 830 to form a decoded block of image information. The control manager 870 controls the combiner 860 to prevent any reference information from being used in the decoding of the INTRA-coded macroblock. The decoded block of image information is passed to the video output 880 of the decoder.
In embodiments of the decoder that employ interpolation of sub-pixel values in advance, the decoded block of image information (i.e., pixel values) produced as a result of the inverse quantization and inverse transform operations performed in blocks 820 and 830 is passed to the in-advance interpolation block 845. Here, interpolation of sub-pixel values is performed in accordance with the method of the invention, the degree of in-advance sub-pixel interpolation applied being determined by the implementation details of the decoder. In embodiments of the invention in which no on-demand interpolation of sub-pixel values is performed, the in-advance sub-pixel value interpolation block 845 interpolates all sub-pixel values. In embodiments that use a combination of in-advance and on-demand interpolation of sub-pixel values, the in-advance sub-pixel value interpolation block 845 interpolates a certain subset of sub-pixel values. This may comprise, for example, all sub-pixels at half-pixel positions, or a combination of sub-pixels at half-pixel and quarter-pixel positions. In any case, after interpolation of sub-pixel values in advance, the interpolated sub-pixel values are stored in frame store 850, together with the original decoded pixel values. As subsequent macroblocks are decoded, interpolated beforehand and stored, a decoded frame, at least partially interpolated to a sub-pixel resolution, is progressively assembled in the frame store 850 and becomes available for use as a reference frame for motion-compensated prediction.
In embodiments of the decoder that do not employ interpolation of sub-pixel values in advance, the decoded block of image information (i.e., pixel values) produced as a result of the inverse quantization and inverse transform operations performed on the macroblock in blocks 820 and 830 is passed directly to frame store 850. As subsequent macroblocks are decoded and stored, a decoded frame, which has unit pixel resolution, is progressively assembled in frame store 850 and becomes available for use as a reference frame for motion-compensated prediction.
When an INTER frame is decoded, the encoded video stream 725 is demultiplexed into encoded prediction error information 723 for each macroblock in the frame, associated motion coefficients 724 and control information. Again, the decoding process is performed macroblock by macroblock. When the encoded prediction error information 723 for a macroblock is extracted from the video stream 725 by the MUX/DMUX unit 810, it is passed to the inverse quantizer 820. The control manager 870 controls the inverse quantizer 820 to apply an appropriate level of inverse quantization to the macroblock of encoded prediction error information, according to the control information received in the video stream 725. The inverse-quantized macroblock of prediction error information is then inversely transformed in the inverse DCT transformer 830 to produce decoded prediction error information for the macroblock.
The motion coefficients 724 associated with the macroblock in question are extracted from the video stream 725 by the MUX/DMUX unit 810 and are passed to the motion compensated prediction block 840, which reconstructs a motion vector for the macroblock using the same motion model as that used to encode the INTER-coded macroblock in the encoder 700. The reconstructed motion vector approximates the motion vector originally determined by the motion estimation block 760 of the encoder. The motion compensated prediction block 840 of the decoder uses the reconstructed motion vector to identify the position of a block of pixel/sub-pixel values in a prediction reference frame stored in frame store 850. The reference frame can be, for example, a previously decoded INTRA frame or a previously decoded INTER frame. In either case, the block of pixel/sub-pixel values indicated by the reconstructed motion vector represents the prediction of the macroblock in question.
The reconstructed motion vector can point to any pixel or sub-pixel position. If the motion vector indicates that the prediction for the current macroblock is formed from pixel values (i.e., values at unit pixel positions), these can simply be retrieved from frame store 850, since the values in question are obtained directly during the decoding of each frame. If the motion vector indicates that the prediction for the current macroblock is formed from sub-pixel values, these must either be retrieved from frame store 850 or calculated in the on-demand sub-pixel interpolation block 890. Which sub-pixel values must be calculated and which can simply be retrieved from the frame store depends on the degree of in-advance interpolation of sub-pixel values used in the decoder.
In embodiments of the decoder that do not employ interpolation of sub-pixel values in advance, the required sub-pixel values are all calculated in the on-demand sub-pixel value interpolation block 890. On the other hand, in embodiments in which all sub-pixel values are interpolated in advance, the motion compensated prediction block 840 can retrieve the required sub-pixel values directly from the frame store 850. In embodiments using a combination of in-advance and on-demand interpolation of sub-pixel values, the action required to obtain the required sub-pixel values depends on which sub-pixel values are interpolated beforehand. Taking as an example an embodiment in which all sub-pixel values at half-pixel positions are calculated beforehand, it is evident that if a reconstructed motion vector for a macroblock points to a pixel at a unit position or a sub-pixel at a half-pixel position, all the pixel or sub-pixel values required to form the prediction for the macroblock are present in the frame store 850 and can be retrieved from it by the motion compensated prediction block 840. However, if the motion vector indicates a sub-pixel at a quarter-pixel position, the values required to form the prediction for the macroblock are not present in the frame store 850 and are therefore calculated in the on-demand sub-pixel value interpolation block 890. In this case, the on-demand sub-pixel value interpolation block 890 retrieves any pixel or sub-pixel values required to perform the interpolation from frame store 850 and applies the interpolation method described below. The sub-pixel values calculated in the on-demand sub-pixel value interpolation block 890 are passed to the motion compensated prediction block 840.
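For the half-pixels-precomputed example just given, the decision between fetching from the frame store and invoking on-demand interpolation reduces to a parity test on the motion vector components. The sketch below is illustrative only; the function name and the representation of the vector in quarter-pixel units are assumptions.

```python
def prediction_source(mv_x, mv_y):
    """For a motion vector expressed in 1/4-pixel units, decide whether a
    decoder that has interpolated all half-pixel positions in advance can
    fetch the prediction directly from the frame store, or must invoke
    on-demand interpolation for a quarter-pixel position."""
    # Unit- and half-pixel positions are even multiples of a quarter pixel.
    if mv_x % 2 == 0 and mv_y % 2 == 0:
        return "frame store"
    return "on-demand interpolation"
```

For example, a vector of (4, 2) quarter-pixel units (one full pixel right, half a pixel down) is served entirely from the frame store, while (1, 0) requires on-demand interpolation.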
Once a prediction for a macroblock has been obtained, the prediction (i.e., a macroblock of predicted pixel values) is passed from the motion compensated prediction block 840 to the combiner 860, where it is combined with the decoded prediction error information for the macroblock to form a reconstructed image block which, in turn, is passed to the video output 880 of the decoder.
It should be appreciated that in practical implementations of encoder 700 and decoder 800, the degree to which frames are interpolated to sub-pixel values in advance, and therefore the amount of on-demand interpolation of sub-pixel values that is performed, can be chosen according to, or be determined by, the hardware implementation of the video encoder 700 or the environment in which it is to be used. For example, if the memory available to the video encoder is limited, or memory must be reserved for other functions, it is appropriate to limit the amount of in-advance interpolation of sub-pixel values. In other cases, where the microprocessor performing the video encoding operation has a limited processing capacity, for example, where the number of operations per second that can be executed is comparatively low, it is more appropriate to restrict the amount of on-demand interpolation of sub-pixel values that is performed. In a mobile communications environment, for example, when video encoding and decoding functionality is incorporated into a mobile phone or similar wireless terminal for communication with a mobile telephone network, both memory and processing power may be limited. In this case, a combination of in-advance and on-demand interpolation of sub-pixel values may be the best option to obtain an efficient implementation in the video encoder. In the video decoder 800, the use of in-advance sub-pixel interpolation is generally not preferred, since it usually results in the calculation of many sub-pixel values that are not actually used in the decoding process. However, it should be appreciated that, although different amounts of in-advance and on-demand interpolation can be used in the encoder and decoder to optimize the operation of each, both the encoder and the decoder can also be implemented to use the same division between in-advance and on-demand interpolation of sub-pixel values.
Although the above description does not describe the construction of bidirectionally predicted frames (B frames) in encoder 700 and decoder 800, it should be understood that this capability can be provided in embodiments of the invention. Providing that capability is considered to be within the competence of a person skilled in the art.
An encoder 700 or a decoder 800 according to the invention can be realized using hardware or software, or a suitable combination of both. An encoder or decoder implemented in software can be, for example, a separate program or a software building block that can be used by several programs. In the description above and in the drawings, the functional blocks are represented as separate units, but the functionality of these blocks can be implemented, for example, in a single software program unit.
The encoder 700 and decoder 800 can also be combined to form a video codec that has both encoding and decoding functionality. In addition to being implemented in a multimedia terminal, the codec can also be implemented in a network. A codec according to the invention can be a computer program or a computer program element, or it can be implemented at least partially using hardware.
Next, the subpixel interpolation method used in encoder 700 and decoder 800 according to the invention is described in detail. The method will first be introduced at a general conceptual level and then two preferred embodiments will be described. In the first preferred embodiment, the interpolation of sub-pixel values is performed at a resolution of 1/4 pixel and in the second the method is extended to a resolution of 1/8 pixel.
It should be noted that the interpolation must produce identical values in the encoder and decoder, but its implementation can be optimized for the two entities separately. For example, in an encoder according to the first embodiment of the invention, in which the interpolation of sub-pixel values is performed at a resolution of 1/4 pixel, it is most efficient to calculate 1/2-resolution sub-pixel values beforehand and to calculate values for 1/4-resolution sub-pixels on demand, only when they are needed during motion estimation. This has the effect of limiting memory usage while maintaining computational complexity/load at an acceptable level. In the decoder, on the other hand, it is advantageous not to calculate any of the sub-pixels beforehand. Therefore, it should be appreciated that a preferred embodiment of the decoder does not include the in-advance sub-pixel value interpolation block 845 and all interpolation of sub-pixel values is performed in the on-demand sub-pixel value interpolation block 890.
In the description of the interpolation method provided below, references are made to the pixel positions illustrated in Figure 14a. In this figure, the pixels marked with the letter A represent original pixels (that is, pixels residing at unit horizontal and unit vertical positions). Pixels marked with other letters represent sub-pixels whose values must be interpolated. The following description adheres to the conventions introduced previously regarding the description of pixel and sub-pixel positions.
Next, the steps required to interpolate all sub-pixel positions will be described:
The values for the 1/2-resolution sub-pixels marked with the letter b are obtained by first calculating an intermediate value b using a Kth-order filter, according to:
b = Σ_{i=1}^{K} x_i A_i    (9)
where x_i is a vector of filter coefficients, A_i is a corresponding vector of original pixel values A located at unit horizontal and unit vertical positions, and K is an integer defining the order of the filter. Equation (9) can therefore be written out as:
b = x_1 A_1 + x_2 A_2 + x_3 A_3 + ... + x_{K-1} A_{K-1} + x_K A_K    (10)
The values of the filter coefficients x_i and the order of the filter K may vary from one embodiment to another. Similarly, different coefficient values can be used in the calculation of different sub-pixels within an embodiment. In other embodiments, the values of the filter coefficients x_i and the order of the filter may depend on which of the 1/2-resolution sub-pixels b is being interpolated. The pixels A_i are arranged symmetrically with respect to the 1/2-resolution sub-pixel b being interpolated and are the closest neighbours of that sub-pixel. In the case of a 1/2-resolution sub-pixel b located at a half horizontal position and a unit vertical position, the pixels A_i are arranged horizontally with respect to b (as shown in Figure 14b). If a 1/2-resolution sub-pixel b located at a unit horizontal position and a half vertical position is being interpolated, the pixels A_i are arranged vertically with respect to b (as shown in Figure 14c).
A final value for the 1/2-resolution sub-pixel b is calculated by dividing the intermediate value b by a constant scale1, truncating the result to obtain an integer and clipping it so that it lies in the range [0, 2^n - 1]. In alternative embodiments of the invention, rounding can be performed instead of truncation. Preferably, the constant scale1 is chosen to be equal to the sum of the filter coefficients x_i.
A value for the 1/2-resolution sub-pixel marked with the letter c is likewise obtained by first calculating an intermediate value c using an Mth-order filter, according to:
c = Σ_{i=1}^{M} y_i b_i    (11)
where y_i is a vector of filter coefficients and b_i is a corresponding vector of intermediate values b taken in the horizontal or vertical direction, that is:
c = y_1 b_1 + y_2 b_2 + y_3 b_3 + ... + y_{M-1} b_{M-1} + y_M b_M    (12)
The values of the filter coefficients y_i and the order of the filter M may vary from one embodiment to another. Similarly, different coefficient values can be used in the calculation of different sub-pixels within an embodiment. Preferably, the values b_i are intermediate values for 1/2-resolution sub-pixels b that are symmetrically arranged with respect to the 1/2-resolution sub-pixel c and are the closest neighbours of sub-pixel c. In one embodiment of the invention, the 1/2-resolution sub-pixels b are arranged horizontally with respect to sub-pixel c; in an alternative embodiment, they are arranged vertically with respect to sub-pixel c.
A final value for the 1/2-resolution sub-pixel c is calculated by dividing the intermediate value c by a constant scale2, truncating the result to obtain an integer and clipping it so that it lies in the range [0, 2^n - 1]. In alternative embodiments of the invention, rounding can be performed instead of truncation. Preferably, the constant scale2 is equal to scale1 * scale1.
It should be noted that the use of intermediate values b in the horizontal direction leads to the same result as the use of intermediate values b in the vertical direction.
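As a concrete illustration of the two filtering stages above, the sketch below uses the six-tap filter (1, -5, 20, 20, -5, 1), whose coefficients sum to 32, for both coefficient vectors x_i and y_i. This particular filter, the orders K = M = 6 and the 8-bit pixel depth (n = 8) are illustrative assumptions, not values mandated by the description; integer floor division stands in for the truncation step, which gives the same result once the output is clipped to a non-negative range.

```python
TAPS = (1, -5, 20, 20, -5, 1)   # assumed filter coefficients; sum = 32
SCALE1 = sum(TAPS)              # scale1 = sum of the coefficients x_i
SCALE2 = SCALE1 * SCALE1        # scale2 = scale1 * scale1


def clip(v, n=8):
    """Clip a value into the range [0, 2^n - 1]."""
    return min(max(v, 0), (1 << n) - 1)


def b_intermediate(pixels):
    """Intermediate value for sub-pixel b from the six nearest full
    pixels A_i placed symmetrically about it (equation 10)."""
    return sum(x * a for x, a in zip(TAPS, pixels))


def b_final(pixels):
    """Final value of sub-pixel b: divide by scale1, truncate, clip."""
    return clip(b_intermediate(pixels) // SCALE1)


def c_final(b_values):
    """Final value of sub-pixel c from six intermediate b values taken in
    one direction (equation 12): divide by scale2, truncate, clip."""
    return clip(sum(y * b for y, b in zip(TAPS, b_values)) // SCALE2)
```

Note that c is computed from the unscaled intermediate b values, which is why its divisor is scale1 * scale1: the intermediate values carry one factor of scale1 and the second filtering pass contributes the other.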
There are two alternatives for interpolating values for the 1/4-resolution sub-pixels marked with the letter h. Both involve linear interpolation along a diagonal line linking 1/2-resolution sub-pixels neighbouring the 1/4-resolution sub-pixel h being interpolated. In a first embodiment, a value for sub-pixel h is calculated by averaging the values of the two 1/2-resolution sub-pixels b closest to sub-pixel h. In a second embodiment, a value for sub-pixel h is calculated by averaging the values of the nearest pixel A and the nearest 1/2-resolution sub-pixel c. It should be noted that this provides the possibility of using different combinations of diagonal interpolations to determine the values for sub-pixels h within the confines of different groups of 4 image pixels A. However, it should also be appreciated that the same combination should be used both in the encoder and in the decoder to produce identical interpolation results. Figure 15 illustrates 4 possible diagonal interpolation choices for sub-pixels h in adjacent groups of 4 pixels within an image. Simulations in the TML environment have verified that both embodiments result in a similar compression efficiency. The second embodiment has greater complexity, since the calculation of sub-pixel c requires the calculation of several intermediate values. Therefore, the first embodiment is preferred.
The values for the 1/4-resolution sub-pixels marked with the letters d and g are calculated from the values of their closest horizontal neighbours using linear interpolation. In other words, a value for the 1/4-resolution sub-pixel d is obtained by averaging the values of its closest horizontal neighbours, original image pixel A and 1/2-resolution sub-pixel b. Similarly, a value for a 1/4-resolution sub-pixel g is obtained by taking the average of its two closest horizontal neighbours, 1/2-resolution sub-pixels b and c.
The values for the 1/4-resolution sub-pixels marked with the letters e, f and i are calculated from the values of their closest neighbours in the vertical direction using linear interpolation. More specifically, a value for the 1/4-resolution sub-pixel e is obtained by averaging the values of its two closest vertical neighbours, original image pixel A and 1/2-resolution sub-pixel b. Similarly, a value for the 1/4-resolution sub-pixel f is obtained by taking the average of its two closest vertical neighbours, 1/2-resolution sub-pixels b and c. In one embodiment of the invention, a value for the 1/4-resolution sub-pixel i is obtained in a manner identical to that just described in connection with the 1/4-resolution sub-pixel f. However, in an alternative embodiment of the invention, and in common with the TML5 and TML6 test models of H.26L described previously, the 1/4-resolution sub-pixel i is determined using the values of the four closest original image pixels, according to (A1 + A2 + A3 + A4 + 2) / 4.
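The four-pixel alternative for sub-pixel i can be written down directly. A minimal sketch, with the function name assumed and the division performed as integer truncation:

```python
def sub_pixel_i(a1, a2, a3, a4):
    """TML5/TML6-style value for the 1/4-resolution sub-pixel i: the
    rounded average (A1 + A2 + A3 + A4 + 2) / 4 of the four surrounding
    original image pixels."""
    return (a1 + a2 + a3 + a4 + 2) // 4
```

The added 2 plays the same rounding-control role for a four-term average as the added 1 does for the two-term averages discussed below.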
It should also be noted that in all cases where an average is determined that implies pixel and / or sub-pixel values, the average can be formed in any appropriate way. For example, the value for the
1/4 resolution sub-pixel d can be defined either as d = (A + b) / 2 or as d = (A + b + 1) / 2. The addition of 1 to the sum of the values for pixel A and 1/2 resolution sub-pixel b causes any subsequently applied rounding or truncation operation to round or truncate the value for d to the next highest integer value. This is true for any sum of integer values and can be applied to any of the averaging operations performed in accordance with the method of the invention to control the effects of rounding or truncation.
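The effect of this rounding control can be sketched as follows. This is an illustrative example, not taken from the patent text itself; the values of A and b are hypothetical final 8-bit pixel and sub-pixel values.

```python
# Effect of adding 1 before a truncating division, using the 1/4 resolution
# sub-pixel d as an example. A and b are hypothetical final values in [0, 255].
A, b = 100, 103

d_down = (A + b) // 2      # truncating division rounds the average down
d_up = (A + b + 1) // 2    # adding 1 makes truncation round the average up

print(d_down, d_up)  # -> 101 102
```

For an even sum the two variants coincide; the added 1 only matters when the sum of the averaged values is odd.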
It should be noted that the method of interpolation of sub-pixel values according to the invention provides advantages over each of TML5 and TML6.
Unlike TML5, in which the values of some of the 1/4 resolution sub-pixels depend on previously interpolated values obtained for other 1/4 resolution sub-pixels, in the method according to the invention all 1/4 resolution sub-pixels are calculated from original image pixels or 1/2 resolution sub-pixel positions using linear interpolation. Therefore, the reduction in precision of those 1/4 resolution sub-pixel values that occurs in TML5, due to truncation and intermediate clipping of the other 1/4 resolution sub-pixels from which they are calculated, does not take place in the method according to the invention. In particular, with reference to Figure 14a, the 1/4 resolution sub-pixels h (and the sub-pixel i in one embodiment of the invention) are interpolated diagonally to reduce the dependence on other 1/4 resolution sub-pixels. Furthermore, in the method according to the invention, the number of calculations (and therefore the number of processor cycles) required to obtain a value for those 1/4 resolution sub-pixels in the decoder is reduced compared with TML5. In addition, the calculation of any 1/4 resolution sub-pixel value requires a number of calculations that is substantially similar to the number required to determine any other 1/4 resolution sub-pixel value. More specifically, in a situation where the required 1/2 resolution sub-pixel values are already available, for example because they have been calculated beforehand, the number of calculations required to interpolate a 1/4 resolution sub-pixel value from the pre-calculated 1/2 resolution sub-pixel values is the same as the number required to calculate any other 1/4 resolution sub-pixel value from the available 1/2 resolution sub-pixel values.
Compared with TML6, the method according to the invention does not require high precision arithmetic for the calculation of all sub-pixels. Specifically, since all 1/4 resolution sub-pixel values are calculated from original image pixels or 1/2 resolution sub-pixel values using linear interpolation, lower precision arithmetic can be used in their interpolation. Consequently, in hardware implementations of the method of the invention, for example in an Application Specific Integrated Circuit (ASIC), the use of lower precision arithmetic reduces the number of components (e.g., gates) that must be dedicated to the calculation of 1/4 pixel resolution values. This, in turn, reduces the overall silicon area that must be dedicated to the interpolation function. Since most sub-pixels are in fact 1/4 resolution sub-pixels (12 of the 15 sub-pixels illustrated in Figure 14a), the advantage provided by the invention in this regard is particularly significant. In software implementations, where sub-pixel interpolation is performed using the standard instruction set of a general purpose central processing unit (CPU) or using a Digital Signal Processor (DSP), a reduction in the precision of the required arithmetic generally entails an increase in the speed at which the calculations can be performed. This is particularly advantageous in "low cost" implementations, in which it is convenient to use a general purpose CPU rather than any form of ASIC.
The method according to the invention provides additional advantages compared with TML5. As mentioned earlier, in the decoder only one of the 15 sub-pixel positions is required at any given time, namely that indicated by the received motion vector information. Therefore, it is advantageous if the value of a sub-pixel in any sub-pixel position can be calculated with the minimum number of stages that results in a correctly interpolated value. The method according to the invention provides this capability. As mentioned in the detailed description above, the 1/2 resolution sub-pixel c can be interpolated by filtering in either the vertical or the horizontal direction, the same value being obtained for c regardless of whether a horizontal or a vertical filter is used. The decoder can therefore take advantage of this property when calculating values for the 1/4 resolution sub-pixels f and g, so as to minimize the number of operations required to obtain the required values. For example, if the decoder requires a value for the 1/4 resolution sub-pixel f, the 1/2 resolution sub-pixel c should be interpolated in the vertical direction. If a value for the 1/4 resolution sub-pixel g is required, it is advantageous to interpolate a value for c in the horizontal direction. Therefore, in general, it can be said that the method according to the invention provides flexibility in the way in which the values for certain 1/4 resolution sub-pixels are derived. No such flexibility is provided in TML5.
Next, two specific embodiments are described in detail. The first represents a preferred embodiment for calculating sub-pixels at a pixel resolution of up to 1/4, while in the second the method according to the invention is extended to the calculation of values for sub-pixels having a pixel resolution of up to 1/8. For both embodiments, a comparison is provided between the flexibility / computational load resulting from the use of the method according to the invention and that resulting from the use of the methods of
interpolation according to TML5 and TML6 in equivalent circumstances.
The preferred embodiment for interpolating sub-pixels at a resolution of up to 1/4 pixel will be described with reference to Figures 14a, 14b and 14c. In what follows, it is assumed that all image pixels and final interpolated values for sub-pixels are represented with 8 bits.
Calculation of 1/2 resolution sub-pixels in i) a half-unit horizontal and unit vertical position and ii) a unit horizontal and half-unit vertical position.
1. A value for the sub-pixel in a half-unit horizontal and unit vertical position, that is, the 1/2 resolution sub-pixel b in Figure 14a, is obtained by first calculating the intermediate value b = (A1 - 5A2 + 20A3 + 20A4 - 5A5 + A6), using the values of the six pixels (A1 to A6) that are located in unit horizontal and unit vertical positions, either in the row or in the column of pixels containing b, and that are arranged symmetrically around b, as shown in Figures 14b and 14c. A final value for the 1/2 resolution sub-pixel b is calculated as (b + 16) / 32, where the operator / indicates division with truncation. The result is clipped so that it lies in the interval [0, 255].
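Step 1 can be sketched as follows. This is a minimal illustrative sketch, assuming Python integer arithmetic for the truncating division (for the non-negative values used here, floor division matches truncation); the pixel values are hypothetical.

```python
def clip(v, lo=0, hi=255):
    """Clip a value into the specified 8-bit dynamic range."""
    return max(lo, min(hi, v))

def half_pel_b(A1, A2, A3, A4, A5, A6):
    """Six-tap filtering for a 1/2 resolution sub-pixel b.

    Returns (intermediate, final): the higher dynamic range intermediate
    value and the final 8-bit value (b + 16) / 32, clipped to [0, 255].
    """
    b_int = A1 - 5 * A2 + 20 * A3 + 20 * A4 - 5 * A5 + A6
    return b_int, clip((b_int + 16) // 32)

# A flat area of pixel value 128 must interpolate back to 128,
# since the filter coefficients sum to 32:
print(half_pel_b(128, 128, 128, 128, 128, 128))  # -> (4096, 128)
```

The intermediate value is retained because step 2 below operates on intermediate rather than final values.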
Calculation of 1/2 resolution sub-pixels in a half-unit horizontal and half-unit vertical position.
2. A value for the sub-pixel in a half-unit horizontal and half-unit vertical position, that is, the 1/2 resolution sub-pixel c in Figure 14a, is calculated as c = (b1 - 5b2 + 20b3 + 20b4 - 5b5 + b6 + 512) / 1024, using the intermediate values b for the six closest 1/2 resolution sub-pixels that are located either in the row or in the column of sub-pixels containing c and that are arranged symmetrically around c. Again, the operator / indicates division with truncation and the result is clipped so that it lies in the range [0, 255]. As explained above, the use of intermediate values b for 1/2 resolution sub-pixels in the horizontal direction leads to the same result as the use of intermediate values b for 1/2 resolution sub-pixels in the vertical direction. Therefore, in an encoder according to the invention, the direction for interpolating the 1/2 resolution sub-pixels b can be chosen according to a preferred implementation mode. In a decoder according to the invention, the direction for interpolating sub-pixels b is chosen according to which 1/4 resolution sub-pixels, if any, will be interpolated using the result obtained for the 1/2 resolution sub-pixel c.
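Step 2 can be sketched as follows, assuming the six intermediate values from step 1 are already available. The values below are illustrative: in a flat area of pixel value 128, each intermediate value equals 128 * 32 = 4096.

```python
def clip(v, lo=0, hi=255):
    """Clip a value into the specified 8-bit dynamic range."""
    return max(lo, min(hi, v))

def half_pel_c(b1, b2, b3, b4, b5, b6):
    """c = (b1 - 5b2 + 20b3 + 20b4 - 5b5 + b6 + 512) / 1024, clipped,
    where b1..b6 are intermediate (not final) values from step 1."""
    return clip((b1 - 5 * b2 + 20 * b3 + 20 * b4 - 5 * b5 + b6 + 512) // 1024)

# Flat area: each intermediate is 4096, and c interpolates back to 128.
print(half_pel_c(4096, 4096, 4096, 4096, 4096, 4096))  # -> 128
```

The divisor 1024 = 32 * 32 removes both filtering stages' scaling in a single truncating division, which is why intermediate rather than final values are used here.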
Calculation of 1/4 resolution sub-pixels in i) a quarter-unit horizontal and unit vertical position; ii) a quarter-unit horizontal and half-unit vertical position; iii) a unit horizontal and quarter-unit vertical position; and iv) a half-unit horizontal and quarter-unit vertical position.
3. The values for the 1/4 resolution sub-pixels d, located in a quarter-unit horizontal and unit vertical position, are calculated according to d = (A + b) / 2, using the nearest original image pixel A and the nearest 1/2 resolution sub-pixel b in the horizontal direction. Similarly, the values for the 1/4 resolution sub-pixels g, located in a quarter-unit horizontal and half-unit vertical position, are calculated according to g = (b + c) / 2, using the two closest 1/2 resolution sub-pixels in the horizontal direction. Likewise, the values for the 1/4 resolution sub-pixels e, located in a unit horizontal and quarter-unit vertical position, are calculated according to e = (A + b) / 2, using the nearest original image pixel A and the nearest 1/2 resolution sub-pixel b in the vertical direction. The values for the 1/4 resolution sub-pixels f, located in a half-unit horizontal and quarter-unit vertical position, are determined from f = (b + c) / 2, using the two closest 1/2 resolution sub-pixels in the vertical direction. In all cases, the operator / indicates division with truncation.
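The four linear averages of step 3 can be sketched together. The values below are hypothetical final 8-bit pixel and sub-pixel values; b_h and b_v stand for the nearest sub-pixel b in the horizontal and vertical direction, respectively.

```python
# Step 3: all four 1/4 resolution positions are plain averages with
# truncating division of already-final 8-bit values (illustrative data).
A, b_h, b_v, c = 96, 100, 104, 108

d = (A + b_h) // 2   # quarter-unit horizontal, unit vertical
g = (b_h + c) // 2   # quarter-unit horizontal, half-unit vertical
e = (A + b_v) // 2   # unit horizontal, quarter-unit vertical
f = (b_v + c) // 2   # half-unit horizontal, quarter-unit vertical

print(d, g, e, f)  # -> 98 104 100 106
```

No further clipping is needed here, since the average of two values in [0, 255] is itself in [0, 255].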
Calculation of 1/4 resolution sub-pixels in a quarter-unit horizontal and quarter-unit vertical position.
4. The values for the 1/4 resolution sub-pixels h, located in a quarter-unit horizontal and quarter-unit vertical position, are calculated according to h = (b1 + b2) / 2, using the two closest 1/2 resolution sub-pixels b in the diagonal direction. Again, the operator / indicates division with truncation.
5. A value for the 1/4 resolution sub-pixel marked with the letter i is calculated from i = (A1 + A2 + A3 + A4 + 2) / 4, using the four closest original image pixels A. Again, the operator / indicates division with truncation.
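Steps 4 and 5 can be sketched together, with hypothetical final values: h is a diagonal average of two final 1/2 resolution sub-pixels b, and i is a bilinear average of the four nearest original image pixels.

```python
# Steps 4 and 5 with illustrative values.
b1, b2 = 100, 110                 # two diagonally neighbouring sub-pixels b
A1, A2, A3, A4 = 90, 100, 110, 120  # four nearest original image pixels

h = (b1 + b2) // 2                # diagonal linear interpolation
i = (A1 + A2 + A3 + A4 + 2) // 4  # bilinear interpolation with rounding term

print(h, i)  # -> 105 105
```

The +2 term in i plays the same rounding-control role as the +1 term discussed earlier for two-value averages.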
Next, an analysis of the computational complexity of the first preferred embodiment of the invention is presented.
In the encoder, the same sub-pixel values are likely to be calculated multiple times. Therefore, and as explained above, the complexity of the encoder can be reduced by calculating all sub-pixel values in advance and storing them in memory. However, this solution increases memory usage by a large margin. In a preferred embodiment of the invention, in which the motion vector precision is a resolution of 1/4 pixel in both the horizontal and vertical dimensions, storing pre-calculated sub-pixel values for the entire image requires 16 times the memory needed to store the original non-interpolated image. To reduce memory usage, all 1/2 resolution sub-pixels can be interpolated in advance and the 1/4 resolution sub-pixels can be calculated on demand, that is, only when necessary. According to the method of the invention, on-demand interpolation of values for 1/4 resolution sub-pixels only requires linear interpolation of 1/2 resolution sub-pixels. Four times the original image memory is required to store the pre-calculated 1/2 resolution sub-pixels, since only 8 bits are necessary to represent them.
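The memory figures above can be checked with simple arithmetic. This sketch uses the QCIF frame size (176 x 144 pixels) given in the background section and assumes 8 bits per stored value.

```python
# Memory comparison for one QCIF luminance frame (176 x 144, 8 bits/value).
width, height = 176, 144
original = width * height          # bytes for the non-interpolated image

full_precalc = 16 * original       # pixel + all 15 sub-pixel positions stored
half_pel_only = 4 * original       # pixel + only 1/2 resolution sub-pixels

print(original, full_precalc, half_pel_only)  # -> 25344 405504 101376
```

The on-demand strategy thus trades a 4x memory cost against the 16x cost of full pre-calculation.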
However, if the same strategy of calculating all 1/2 resolution sub-pixels in advance is used together with the TML6 direct interpolation scheme, the memory requirement increases to 9 times the memory required to store the original non-interpolated image. This results from the fact that a larger number of bits is required to store the high precision intermediate values associated with each 1/2 resolution sub-pixel in TML6. In addition, the complexity of sub-pixel interpolation during motion estimation is greater in TML6, since scaling and clipping must be performed for each 1/2 and 1/4 sub-pixel position.
Next, the complexity of the method of interpolation of sub-pixel values according to the invention, when applied in a video decoder, is compared with that of the interpolation schemes used in TML5 and TML6. Throughout the analysis that follows, it is assumed that in each method the interpolation of any sub-pixel value is performed using only the minimum number of stages required to obtain a correctly interpolated value. In addition, it is assumed that each method is implemented in a block-based manner, that is, intermediate values common to all sub-pixels to be interpolated in a particular N x M block are calculated only once. An illustrative example is provided in Figure 16. With reference to Figure 16, it can be seen that, to calculate a 4x4 block of 1/2 resolution sub-pixels c, a 9x4 block of 1/2 resolution sub-pixels b must first be calculated.
Compared with the TML5 sub-pixel value interpolation method, the method according to the invention has a lower computational complexity for the following reasons:
1. Unlike the interpolation scheme of sub-pixel values used in TML5, according to the method of the invention a value for a 1/2 resolution sub-pixel c can be obtained by filtering in either the vertical or the horizontal direction. Therefore, to reduce the number of operations, the 1/2 resolution sub-pixel c can be interpolated in the vertical direction if a value for a 1/4 resolution sub-pixel f is required, and in the horizontal direction if a value for a 1/4 resolution sub-pixel g is required. As an example, Figure 17 shows all the 1/2 resolution sub-pixel values that must be calculated to interpolate values for 1/4 resolution sub-pixels g in an image block defined by 4 x 4 original image pixels, using the interpolation method of TML5 (Figure 17a) and using the method according to the invention (Figure 17b). In this example, the method of interpolation of sub-pixel values according to TML5 requires a total of 88 1/2 resolution sub-pixels to be interpolated, while the method according to the invention requires the calculation of 72 1/2 resolution sub-pixels. As can be seen from Figure 17b, according to the invention the 1/2 resolution sub-pixels c are interpolated in the horizontal direction to reduce the number of calculations required.
2. According to the method of the invention, the 1/4 resolution sub-pixel h is calculated by linear interpolation from its two closest neighbouring 1/2 resolution sub-pixels in the diagonal direction. The respective numbers of 1/2 resolution sub-pixels that must be calculated to obtain values for 1/4 resolution sub-pixels h within a 4x4 pixel block of the original image, using the sub-pixel value interpolation method according to TML5 and the method according to the invention, are shown in Figures 18(a) and 18(b), respectively. When using the method according to TML5, it is necessary to interpolate a total of 56 1/2 resolution sub-pixels, while, according to the method of the invention, it is necessary to interpolate 40 1/2 resolution sub-pixels.
Table 1 summarizes the decoder complexities of the three sub-pixel value interpolation methods considered here: that according to TML5, the direct interpolation method used in TML6, and the method according to the invention. Complexity is measured in terms of the number of six-branch filter and linear interpolation operations performed. It is assumed that the 1/4 resolution sub-pixel i is calculated according to i = (A1 + A2 + A3 + A4 + 2) / 4, which is a bilinear interpolation and effectively comprises two linear interpolation operations. The operations required to interpolate the sub-pixel values within a 4x4 block of original image pixels are listed for each of the 15 sub-pixel positions, which, for convenience of reference, are numbered according to the scheme shown in Figure 19. With reference to Figure 19, position 1 is the position of an original image pixel A and positions 2 to 16 are positions of
sub-pixels. Position 16 is the position of the 1/4 resolution sub-pixel i. When calculating the average number of operations, it has been assumed that the probability of a motion vector indicating each sub-pixel position is the same. The average complexity is therefore the average over the 15 sums calculated for the sub-pixel positions and the single full-pixel position.
Table 1: Interpolation complexity of 1/4 resolution sub-pixels in TML5, TML6 and the method according to the invention
  TML5 TML6 Method of the invention
  Position  Linear 6 branches Linear 6 branches Linear 6 branches
  1  0 0 0 0 0 0
  3, 9  0 16 0 16 0 16
  2, 4, 5, 13  16 16 0 16 16 16
  11  0 52 0 52 0 52
  7, 15  16 52 0 52 16 52
  10, 12  16 68 0 52 16 52
  6, 8, 14  48 68 0 52 16 32
  16  32 0 32 0 32 0
  Average  19 37 2 32 13 28.25
It can be seen from Table 1 that the method according to the invention requires fewer six-branch filter operations than the interpolation method of sub-pixel values according to TML6 and only a few additional linear interpolation operations. Since the six-branch filter operations are much more complex than the linear interpolation operations, the complexity of the two methods is similar. The interpolation method of sub-pixel values according to TML5 has considerably higher complexity.
Next, the preferred embodiment for interpolating sub-pixels up to a pixel resolution of 1/8 will be described with reference to Figures 20, 21 and 22.
Figure 20 presents the nomenclature used to describe pixels, 1/2 resolution sub-pixels, 1/4 resolution sub-pixels and 1/8 resolution sub-pixels in this extension of the method according to the invention.
1. The values for the 1/2 resolution and 1/4 resolution sub-pixels marked with the letters b1, b2 and b3 in Figure 20 are obtained by first calculating the intermediate values b1 = (-3A1 + 12A2 - 37A3 + 229A4 + 71A5 - 21A6 + 6A7 - A8); b2 = (-3A1 + 12A2 - 39A3 + 158A4 + 158A5 - 39A6 + 12A7 - 3A8); and b3 = (-A1 + 6A2 - 21A3 + 71A4 + 229A5 - 37A6 + 12A7 - 3A8), using the values of the eight closest image pixels (A1 to A8) located in unit horizontal and unit vertical positions, either in the row or in the column containing b1, b2 and b3, and arranged symmetrically around the 1/2 resolution sub-pixel b2. The asymmetries in the filter coefficients used to obtain the intermediate values b1 and b3 account for the fact that pixels A1 to A8 are not located symmetrically with respect to the 1/4 resolution sub-pixels b1 and b3. The final values for the sub-pixels bi, i = 1, 2, 3 are calculated according to bi = (b'i + 128) / 256, where b'i is the corresponding intermediate value and the operator / indicates division with truncation. The result is clipped so that it lies in the interval [0, 255].
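The three eight-tap filters of step 1 can be sketched as follows. The coefficient sets below are as reconstructed above (each set sums to 256, matching the divisor); the pixel values are illustrative.

```python
def clip(v, lo=0, hi=255):
    """Clip a value into the specified 8-bit dynamic range."""
    return max(lo, min(hi, v))

# Eight-tap coefficient sets for the 1/4, 1/2 and 3/4 positions b1, b2, b3.
TAPS = {
    "b1": (-3, 12, -37, 229, 71, -21, 6, -1),
    "b2": (-3, 12, -39, 158, 158, -39, 12, -3),
    "b3": (-1, 6, -21, 71, 229, -37, 12, -3),
}

def eighth_step1(name, pixels):
    """Intermediate and final value for sub-pixel b1, b2 or b3."""
    b_int = sum(t * p for t, p in zip(TAPS[name], pixels))
    return b_int, clip((b_int + 128) // 256)

# A flat area of value 128 must interpolate back to 128 for every filter:
flat = [128] * 8
print([eighth_step1(k, flat)[1] for k in ("b1", "b2", "b3")])  # -> [128, 128, 128]
```

Note that b1 and b3 are mirror images of each other, consistent with their symmetric placement on either side of b2.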
2. The values for the subpixels of 1/2 resolution and 1/4 resolution marked with cij, i, j = 1, 2, 3, are calculated according to
c1j = (-3b1j + 12b2j - 37b3j + 229b4j + 71b5j - 21b6j + 6b7j - b8j + 32768) / 65536,
c2j = (-3b1j + 12b2j - 39b3j + 158b4j + 158b5j - 39b6j + 12b7j - 3b8j + 32768) / 65536
and
c3j = (-b1j + 6b2j - 21b3j + 71b4j + 229b5j - 37b6j + 12b7j - 3b8j + 32768) / 65536
using the intermediate values b1, b2 and b3 calculated for the eight closest sub-pixels (b1j to b8j) in the vertical direction. The sub-pixels b1j to b8j are located in the column comprising the 1/2 resolution and 1/4 resolution sub-pixels cij being interpolated and are arranged symmetrically around the 1/2 resolution sub-pixel c2j. The asymmetries in the filter coefficients used to obtain the values for sub-pixels c1j and c3j account for the fact that sub-pixels b1j to b8j are not located symmetrically with respect to the 1/4 resolution sub-pixels c1j and c3j. Again, the operator / indicates division with truncation. Before the interpolated values for the sub-pixels are stored in the frame memory, they are clipped to lie in the interval [0, 255]. In an alternative embodiment of the invention, the 1/2 resolution and 1/4 resolution sub-pixels cij are calculated in a similar manner in the horizontal direction, using the intermediate values b1, b2 and b3.
3. The values for the 1/8 resolution sub-pixels marked with the letter d are calculated using linear interpolation from the values of their nearest neighbouring image pixels, 1/2 resolution sub-pixels or 1/4 resolution sub-pixels in the horizontal or vertical direction. For example, the top-left-most 1/8 resolution sub-pixel d is calculated according to d = (A + b1 + 1) / 2. As before, the operator / indicates division with truncation.
4. The values for the 1/8 resolution sub-pixels marked with the letters e and f are calculated using linear interpolation from the values of image pixels, 1/2 resolution sub-pixels or 1/4 resolution sub-pixels in the diagonal direction. For example, with reference to Figure 20, the top-left 1/8 resolution sub-pixel e is calculated according to e = (b1 + b1 + 1) / 2, using its two diagonally neighbouring sub-pixels b1. The diagonal direction to be used in the interpolation of each 1/8 resolution sub-pixel in a first embodiment of the invention, hereafter referred to as "preferred method 1", is indicated in Figure 21(a). The values for the 1/8 resolution sub-pixels marked with the letter g are calculated according to g = (A + 3c22 + 3) / 4. As always, the operator / indicates division with truncation. In an alternative embodiment of the invention, hereinafter referred to as "preferred method 2", the computational complexity is further reduced by interpolating the 1/8 resolution sub-pixels f from the sub-pixels b2, that is, according to the relation f = (3b2 + b2 + 2) / 4, where the sub-pixel b2 that is closer to f is multiplied by 3. The diagonal interpolation scheme used in this alternative embodiment of the invention is illustrated in Figure 21(b). In further alternative embodiments, different diagonal interpolation schemes can be contemplated.
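The weighted diagonal averages of step 4 can be sketched as follows, with hypothetical final values: g weights the nearer sub-pixel c22 by 3, and, in "preferred method 2", f weights the nearer of the two sub-pixels b2 by 3.

```python
# Weighted diagonal averages for 1/8 resolution sub-pixels (illustrative data).
A, c22 = 100, 108          # original pixel and nearest c22 sub-pixel
b2_near, b2_far = 104, 112 # the two b2 sub-pixels, nearer one weighted by 3

g = (A + 3 * c22 + 3) // 4           # g as in preferred methods 1 and 2
f = (3 * b2_near + b2_far + 2) // 4  # f as in preferred method 2

print(g, f)  # -> 106 106
```

The 3:1 weighting reflects the fact that the interpolated 1/8 resolution position lies three times closer to one of its diagonal neighbours than to the other.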
It should also be noted that in all cases where an average involving pixel and/or sub-pixel values is used in the determination of 1/8 resolution sub-pixels, the average can be formed in any appropriate way. The addition of 1 to the sum of the values used to calculate an average causes any subsequently applied rounding or truncation operation to round or truncate the average in question to the next highest integer value. In alternative embodiments of the invention, the addition of 1 is not used.
As in the case of the interpolation of sub-pixel values at a resolution of 1/4 pixel described above, the memory requirements in the encoder can be reduced by pre-calculating only a part of the sub-pixel values to be interpolated. In the case of interpolation of sub-pixel values at a pixel resolution of 1/8, it is advantageous to calculate all 1/2 resolution and 1/4 resolution sub-pixels in advance and to calculate the 1/8 resolution sub-pixel values on demand, only when required. When this approach is adopted, both the interpolation method according to TML5 and that according to the invention require 16 times the original image memory to store the 1/2 resolution and 1/4 resolution sub-pixel values. However, if the direct interpolation method according to TML6 is used in the same way, the intermediate values for the 1/2 resolution and 1/4 resolution sub-pixels must be stored. These intermediate values are represented with a precision of 32 bits, and this results in a memory requirement 64 times that of the original non-interpolated image.
Next, the complexity of the method of interpolation of sub-pixel values according to the invention, when applied in a video decoder to calculate values for sub-pixels at a pixel resolution of up to 1/8, is compared with that of the interpolation schemes used in TML5 and TML6. As in the equivalent analysis for interpolation of sub-pixel values at 1/4 pixel resolution described above, it is assumed that in each method the interpolation of any sub-pixel value is performed using only the minimum number of stages required to obtain a correctly interpolated value. It is also assumed that each method is implemented in a block-based manner, such that the intermediate values common to all sub-pixels to be interpolated in a particular N x M block are calculated only once.
Table 2 summarizes the complexities of the three interpolation methods. Complexity is measured in terms of the number of 8-branch filter and linear interpolation operations performed in each method. The table presents the number of operations required to interpolate each of the 63 1/8 resolution sub-pixels within a block of 4 x 4 original image pixels; each sub-pixel position is identified with a corresponding number, as illustrated in Figure 22. In Figure 22, position 1 is the position of an original image pixel and positions 2 to 64 are sub-pixel positions. When calculating the average number of operations, it has been assumed that the probability that a motion vector points to each sub-pixel position is the same. The average complexity is therefore the average over the 63 sums calculated for the sub-pixel positions and the single full-pixel position.
Table 2: Interpolation complexity of 1/8 resolution subpixels in TML5, TML6 and the method according to the invention. (The results are shown separately for preferred method 1 and preferred method 2)
  TML5 TML6 Preferred method 1 Preferred method 2
  Position  Linear 8 branches Linear 8 branches Linear 8 branches Linear 8 branches
  1  0 0 0 0 0 0 0 0
  3, 5, 7, 17, 33, 49  0 16 0 16 0 16 0 16
  19, 21, 23, 35, 37, 39, 51, 53, 55  0 60 0 60 0 60 0 60
  2, 8, 9, 57  16 16 0 16 16 16 16 16
  4, 6, 25, 41  16 32 0 16 16 32 16 32
  10, 16, 58, 64  32 76 0 60 16 32 16 32
  11, 13, 15, 59, 61, 63  16 60 0 60 16 60 16 60
  18, 24, 34, 40, 50, 56  16 76 0 60 16 60 16 60
  12, 14, 60, 62  32 120 0 60 16 32 16 32
  26, 32, 42, 48  32 108 0 60 16 32 16 32
  20, 22, 36, 38, 52, 54  16 120 0 60 16 76 16 76
  27, 29, 31, 43, 45, 47  16 76 0 60 16 76 16 76
  28, 30, 44, 46  32 152 0 60 16 60 16 60
  Average  64 290.25 0 197.75 48 214.75 48 192.75
As can be seen from Table 2, the numbers of 8-branch filter operations performed according to preferred methods 1 and 2 are, respectively, 26% and 34% lower than the number of 8-branch filter operations performed in the TML5 sub-pixel value interpolation method. The number of linear operations is 25% lower in both preferred method 1 and preferred method 2 compared with TML5, but this improvement is of minor importance compared with the reduction in 8-branch filter operations. It can also be seen that the direct interpolation method used in TML6 has a complexity comparable to that of preferred methods 1 and 2 when used to interpolate values for 1/8 resolution sub-pixels.
Claims (19)
[1]
1. A method of interpolating an image, the image comprising pixels (A) arranged in rows and columns and represented by values having a specified dynamic range, the pixels (A) of the rows residing in unit horizontal positions and the pixels (A) of the columns residing in unit vertical positions, to generate values for sub-pixels (b, c, d, e, f, g, h, i), a sub-pixel (b, c, d, e, f, g, h, i) being located in at least one of a fractional horizontal position and a fractional vertical position, the fractional horizontal and fractional vertical positions being representable according to the mathematical notation 1/2x, where x is a positive integer in the range from 1 to N, 1/2x representing a particular level of sub-pixel interpolation and N representing a maximum level of sub-pixel interpolation, the method comprising:
a) interpolate values for subpixels (b) located in horizontal 1 / 2N-1 unit and vertical unit positions, and for subpixels (b) located in horizontal unit and vertical 1 / 2N-1 unit positions using weighted sums of pixels (A) that reside in respective horizontal unit and vertical unit positions;
b) interpolate a value for a sub-pixel (c) located in a horizontal 1/2N-1 unit and vertical 1/2N-1 unit position using either a first weighted sum of the interpolated values for sub-pixels (b) that reside in horizontal 1/2N-1 unit and vertical unit positions, or a second weighted sum of the interpolated values for sub-pixels (b) that reside in horizontal unit and vertical 1/2N-1 unit positions, obtained in step a); and
c) interpolate a value for a sub-pixel (h, i) located in a horizontal 1/2N unit and vertical 1/2N unit position using a weighted average of the value of a first sub-pixel (b) located in a horizontal 1/2N-m unit and vertical 1/2N-n unit position and the value of a second sub-pixel (b) located in a horizontal 1/2N-p unit and vertical 1/2N-q unit position, the variables m, n, p and q taking integer values in the range of 1 to N, such that the respective first and second sub-pixels (b) are located diagonally with respect to the sub-pixel (h, i) in the horizontal 1/2N unit and vertical 1/2N unit position being interpolated,
wherein in step a), interpolation of a sub-pixel value using a weighted sum implies the calculation of an intermediate value for the sub-pixel value, said intermediate value having a dynamic range greater than the specified dynamic range, and the calculation of a final value for the sub-pixel value by dividing the intermediate value by a scale factor equal to the sum of the respective weights used in the weighted sum of step a), rounding the result to an integer and clipping it, thus forming a sub-pixel value with a dynamic range equal to the specified dynamic range, and wherein the intermediate values calculated in step a) are used as the interpolated values in the respective weighted sums of step b), and the final values calculated in step a) are used as the values of the first sub-pixel (b) and the second sub-pixel (b) when the value for the sub-pixel located in a horizontal 1/2N unit and vertical 1/2N unit position is interpolated in step c).
[2]
2. A method according to claim 1, wherein a first and a second weight are used in the weighted average referred to in step c), the relative magnitudes of the weights being proportional to the proximity, along the diagonal line, of the sub-pixel (h, i) in the horizontal 1/2N unit and vertical 1/2N unit position to the respective first and second sub-pixels (b) used in step c).
[3]
3. A method according to claim 2, wherein, in a situation in which the respective first and second sub-pixels (b) used in step c) are located symmetrically with respect to the sub-pixel (h, i) in the horizontal 1/2N unit and vertical 1/2N unit position being interpolated, the first and second weights have equal values.
[4]
4. A method according to claim 1, wherein the first weighted sum of the interpolated values for sub-pixels (b) residing in horizontal 1/2N-1 unit and vertical unit positions in step b) is used when interpolating a value for a sub-pixel (f) in a horizontal 1/2N-1 unit and vertical 1/2N unit position, and wherein the second weighted sum of the interpolated values for sub-pixels (b) residing in horizontal unit and vertical 1/2N-1 unit positions in step b) is used when interpolating a value for a sub-pixel (g) in a horizontal 1/2N unit and vertical 1/2N-1 unit position.
[5]
5. A method according to claim 1, comprising interpolating values for sub-pixels (h, i) in horizontal 1/2^N unit and vertical 1/2^N unit positions by taking an average of the value of a sub-pixel (b) located in a horizontal 1/2^(N-1) unit and vertical unit position and the value of a sub-pixel (b) located in a horizontal unit and vertical 1/2^(N-1) unit position.
[6]
6. A method according to claim 1, wherein N is the integer value 2.
[7]
7. A method according to claim 1, wherein the specified dynamic range corresponds to the range of values that said pixels (A) can take.
[8]
8. An interpolator (730, 750, 845, 890) for interpolating an image, the image comprising pixels (A) arranged in rows and columns and represented by values having a specified dynamic range, the pixels (A) of the rows residing in horizontal unit positions and the pixels (A) of the columns residing in vertical unit positions, the interpolator (730, 750, 845, 890) being adapted to generate values for sub-pixels (b, c, d, e, f, g, h, i), a sub-pixel (b, c, d, e, f, g, h, i) being located in at least one of a fractional horizontal position and a fractional vertical position, the fractional horizontal and fractional vertical positions being representable according to the mathematical notation 1/2^x, where x is a positive integer in the range 1 to N, 1/2^x representing a particular level of sub-pixel interpolation and N representing a maximum level of sub-pixel interpolation, the interpolator (730, 750, 845, 890) being adapted to:
a) interpolate values for sub-pixels (b) located in horizontal 1/2^(N-1) unit and vertical unit positions, and for sub-pixels (b) located in horizontal unit and vertical 1/2^(N-1) unit positions, using weighted sums of pixels (A) residing in respective horizontal unit and vertical unit positions;
b) interpolate a value for a sub-pixel (c) located in a horizontal 1/2^(N-1) unit and vertical 1/2^(N-1) unit position using either a first weighted sum of the interpolated values for sub-pixels (b) residing in horizontal 1/2^(N-1) unit and vertical unit positions, or a second weighted sum of the interpolated values for sub-pixels (b) residing in horizontal unit and vertical 1/2^(N-1) unit positions, obtained in step a); and
c) interpolate a value for a sub-pixel (h, i) located in a horizontal 1/2^N unit and vertical 1/2^N unit position using either:
- a weighted average of the value of a first sub-pixel (b) located in a horizontal 1/2^(N-m) unit and vertical 1/2^(N-n) unit position and the value of a second sub-pixel (b) located in a horizontal 1/2^(N-p) unit and vertical 1/2^(N-q) unit position, or
- a weighted average of the value of a pixel (A) located in a horizontal unit and vertical unit position and the value of a sub-pixel (c) located in a horizontal 1/2^(N-m) unit and vertical 1/2^(N-n) unit position,
the variables m, n, p and q taking integer values in the range 1 to N, such that the respective first and second sub-pixels (b), or the respective pixel (A) and sub-pixel (c), lie diagonally with respect to the sub-pixel (h, i) in the horizontal 1/2^N unit and vertical 1/2^N unit position being interpolated.
[9]
9. An interpolator (730, 750, 845, 890) according to claim 8, wherein the interpolator (730, 750, 845, 890) is configured to use a first and a second weight in the weighted average referred to in step c), the relative magnitudes of the weights being proportional to the proximity, along the diagonal straight line, of the sub-pixel (h, i) in the horizontal 1/2^N unit and vertical 1/2^N unit position to the respective first and second sub-pixels (b) or the respective pixels (A) and sub-pixels (c) used in step c).
[10]
10. An interpolator (730, 750, 845, 890) according to claim 9, wherein the interpolator (730, 750, 845, 890) is configured to use first and second weights having equal values in a situation where the respective first and second sub-pixels (b) or the respective pixels (A) and sub-pixels (c) used in step c) are placed symmetrically with respect to the sub-pixel (h, i) in the horizontal 1/2^N unit and vertical 1/2^N unit position being interpolated.
[11]
11. An interpolator (730, 750, 845, 890) according to claim 8, wherein the interpolator (730, 750, 845, 890) is configured to use the first weighted sum of the interpolated values for sub-pixels (b) residing in horizontal 1/2^(N-1) unit and vertical unit positions in step b) when interpolating a value for a sub-pixel (f) in the horizontal 1/2^(N-1) unit and vertical 1/2^N unit position, and is configured to use the second weighted sum of the interpolated values for sub-pixels (b) residing in horizontal unit and vertical 1/2^(N-1) unit positions in step b) when interpolating a value for a sub-pixel (g) in the horizontal 1/2^N unit and vertical 1/2^(N-1) unit position.
[12]
12. An interpolator (730, 750, 845, 890) according to claim 8, wherein the interpolator (730, 750, 845, 890) is configured to interpolate values for sub-pixels (h, i) in a horizontal 1/2^N unit and vertical 1/2^N unit position by taking an average of the value of a pixel (A) located in a horizontal unit and vertical unit position and the value of a sub-pixel (c) located in a horizontal 1/2^(N-1) unit and vertical 1/2^(N-1) unit position.
[13]
13. An interpolator (730, 750, 845, 890) according to claim 8, wherein the interpolator (730, 750, 845, 890) is configured to interpolate values for sub-pixels (h, i) in a horizontal 1/2^N unit and vertical 1/2^N unit position by taking an average of the value of a sub-pixel (b) located in a horizontal 1/2^(N-1) unit and vertical unit position and the value of a sub-pixel (b) located in a horizontal unit and vertical 1/2^(N-1) unit position.
[14]
14. An interpolator (730, 750, 845, 890) according to claim 8, wherein N is one of the integer values 2, 3 and 4.
[15]
15. An interpolator (730, 750, 845, 890) according to claim 8,
wherein the interpolator (730, 750, 845, 890) is configured to interpolate the value for the sub-pixel (h, i) located in a horizontal 1/2^N unit and vertical 1/2^N unit position in step c) using the weighted average of the value of the first sub-pixel (b) located in a horizontal 1/2^(N-m) unit and vertical 1/2^(N-n) unit position and the value of the second sub-pixel (b) located in a horizontal 1/2^(N-p) unit and vertical 1/2^(N-q) unit position, wherein in step a), the interpolator (730, 750, 845, 890) is configured to interpolate a sub-pixel value using a weighted sum that involves the calculation of an intermediate value for the sub-pixel value, said intermediate value having a dynamic range greater than the specified dynamic range, and the calculation of a final value for the sub-pixel value by dividing the intermediate value by a scale factor having a value equal to the sum of the respective weights used in the weighted sum of step a), rounding to obtain an integer and clipping the result, thus forming a sub-pixel value with a dynamic range equal to the specified dynamic range,
and
wherein the interpolator (730, 750, 845, 890) is configured to use the intermediate values calculated in step a) as the interpolated values in the respective weighted sums of step b), and is configured to use the final values calculated in step a) as the values of the first sub-pixel (b) and the second sub-pixel (b) when interpolating the value for the sub-pixel located in a horizontal 1/2^N unit and vertical 1/2^N unit position in step c).
[16]
16. An interpolator (730, 750, 845, 890) according to claim 8, wherein the specified dynamic range corresponds to the range of values that said pixels (A) can take.
[17]
17. A video encoder (700), a video decoder (800) or a video codec (700, 800) comprising an interpolator (730, 750, 845, 890) according to any one of claims 8 to 16.
[18]
18. A communications terminal (60, MS) comprising a video encoder, a video decoder or a video codec according to claim 17.
[19]
19. A computer program for interpolating an image, the image comprising pixels (A) arranged in rows and columns and represented by values having a specified dynamic range, the pixels (A) of the rows residing in horizontal unit positions and the pixels (A) of the columns residing in vertical unit positions, to generate values for sub-pixels (b, c, d, e, f, g, h, i), a sub-pixel (b, c, d, e, f, g, h, i) being located in at least one of a fractional horizontal position and a fractional vertical position, the fractional horizontal and fractional vertical positions being representable according to the mathematical notation 1/2^x, where x is a positive integer in the range 1 to N, 1/2^x representing a particular level of sub-pixel interpolation and N representing a maximum level of sub-pixel interpolation, the computer program comprising:
a) program code for interpolating values for sub-pixels (b) located in horizontal 1/2^(N-1) unit and vertical unit positions, and for sub-pixels (b) located in horizontal unit and vertical 1/2^(N-1) unit positions, using weighted sums of pixels (A) residing in respective horizontal unit and vertical unit positions;
b) program code for interpolating a value for a sub-pixel (c) located in a horizontal 1/2^(N-1) unit and vertical 1/2^(N-1) unit position using either a first weighted sum of the interpolated values for sub-pixels (b) residing in horizontal 1/2^(N-1) unit and vertical unit positions, or a second weighted sum of the interpolated values for sub-pixels (b) residing in horizontal unit and vertical 1/2^(N-1) unit positions, obtained in step a); and
c) program code for interpolating a value for a sub-pixel (h, i) located in a horizontal 1/2^N unit and vertical 1/2^N unit position using either:
- a weighted average of the value of a first sub-pixel (b) located in a horizontal 1/2^(N-m) unit and vertical 1/2^(N-n) unit position and the value of a second sub-pixel (b) located in a horizontal 1/2^(N-p) unit and vertical 1/2^(N-q) unit position, or
- a weighted average of the value of a pixel (A) located in a horizontal unit and vertical unit position and the value of a sub-pixel (c) located in a horizontal 1/2^(N-m) unit and vertical 1/2^(N-n) unit position,
the variables m, n, p and q taking integer values in the range 1 to N, such that the respective first and second sub-pixels (b), or the respective pixel (A) and sub-pixel (c), lie diagonally with respect to the sub-pixel (h, i) in the horizontal 1/2^N unit and vertical 1/2^N unit position being interpolated;
d) program code for interpolating a value for a sub-pixel (f) in the horizontal 1/2^(N-1) unit and vertical 1/2^N unit position; and
e) program code for interpolating a value for a sub-pixel (g) in the horizontal 1/2^N unit and vertical 1/2^(N-1) unit position;
wherein the computer program is configured such that
- the first weighted sum of the interpolated values for the sub-pixels (b) residing in horizontal 1/2^(N-1) unit and vertical unit positions is used to interpolate the value for the sub-pixel (c) located in a horizontal 1/2^(N-1) unit and vertical 1/2^(N-1) unit position, when the value for the sub-pixel (f) is interpolated in the horizontal 1/2^(N-1) unit and vertical 1/2^N unit position, and
- the second weighted sum of the interpolated values for the sub-pixels (b) residing in horizontal unit and vertical 1/2^(N-1) unit positions is used to interpolate the value for the sub-pixel (c) located in a horizontal 1/2^(N-1) unit and vertical 1/2^(N-1) unit position, when the value for the sub-pixel (g) is interpolated in the horizontal 1/2^N unit and vertical 1/2^(N-1) unit position.
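To make the diagonal averaging of step c) concrete, here is a minimal sketch, not drawn from the patent text itself; the equal default weights correspond to the symmetric placement of claims 3 and 10, and the 8-bit range [0, 255] is an assumption for the example.

```python
def diagonal_average(value_1, value_2, weight_1=1, weight_2=1, max_val=255):
    """Step c): weighted average of two values lying on a diagonal through
    the 1/2^N-position sub-pixel (h, i) -- either two sub-pixels (b), or a
    pixel (A) paired with a sub-pixel (c).

    The relative weights are proportional to proximity along the diagonal;
    for symmetric placement they are equal.
    """
    total = weight_1 + weight_2
    # Weighted sum, rounded to the nearest integer, then clipped to range.
    avg = (weight_1 * value_1 + weight_2 * value_2 + total // 2) // total
    return max(0, min(max_val, avg))
```

With equal weights this reduces to (value_1 + value_2 + 1) >> 1, i.e. the rounded mean of the two diagonal neighbours.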
Similar technologies:
Publication number | Publication date | Patent title
ES2540583T3|2015-07-10|Method for interpolation of sub-pixel value
AU2003201069B2|2008-02-28|Coding dynamic filters
GB2379820A|2003-03-19|Interpolating values for sub-pixels
AU2007237319B2|2011-01-20|Method for sub-pixel value interpolation
BRPI0211263B1|2018-10-02|VIDEO ENCODER INTERPOLATION METHOD, VIDEO ENCODER FOR ENCODING AN IMAGE AND COMMUNICATIONS TERMINAL
Family patents:
Publication number | Publication date
CN101232622B|2011-04-13|
KR101176903B1|2012-08-30|
US6950469B2|2005-09-27|
JP2011101411A|2011-05-19|
HU228954B1|2013-07-29|
JP4700704B2|2011-06-15|
EE05594B1|2012-10-15|
US20050220353A1|2005-10-06|
HU0400295A2|2004-08-30|
ZA200308785B|2004-09-16|
SG167662A1|2011-01-28|
US20030112864A1|2003-06-19|
BR0211263A|2004-07-20|
ES2540583T7|2017-07-10|
RU2004101290A|2005-06-20|
AU2002324085C1|2008-06-05|
EP1433316B3|2017-01-18|
MXPA04000203A|2004-03-18|
EP1433316B1|2015-04-29|
US20080069203A1|2008-03-20|
KR100972850B1|2010-07-28|
US7280599B2|2007-10-09|
CN1331353C|2007-08-08|
ES2540583T3|2015-07-10|
KR20110115181A|2011-10-20|
RU2477575C2|2013-03-10|
CN101232622A|2008-07-30|
JP5502765B2|2014-05-28|
RU2317654C2|2008-02-20|
RU2007133925A|2009-03-20|
CA2452632C|2013-04-23|
JP2008187727A|2008-08-14|
CN1537384A|2004-10-13|
JP2005503734A|2005-02-03|
US8036273B2|2011-10-11|
EP1433316B9|2017-04-12|
PT1433316E|2015-07-31|
HK1118411A1|2009-02-06|
WO2003026296A1|2003-03-27|
JP4698947B2|2011-06-08|
KR20040036902A|2004-05-03|
KR20080007276A|2008-01-17|
EP1433316A1|2004-06-30|
EE200400046A|2004-04-15|
AU2002324085B2|2007-09-06|
CA2452632A1|2003-03-27|
Cited documents:
Publication number | Application date | Publication date | Applicant | Patent title

EP0294958B1|1987-06-09|1995-08-23|Sony Corporation|Motion compensated interpolation of digital television images|
GB8713454D0|1987-06-09|1987-07-15|Sony Corp|Television standards converters|
US4816913A|1987-11-16|1989-03-28|Technology, Inc., 64|Pixel interpolation circuitry as for a video signal processor|
US4937666A|1989-12-04|1990-06-26|Bell Communications Research, Inc.|Circuit implementation of block matching algorithm with fractional precision|
GB2249906B|1990-11-15|1994-04-27|Sony Broadcast & Communication|Motion compensated interpolation of images|
JP2861462B2|1991-04-12|1999-02-24|ソニー株式会社|Motion vector detection device|
US5337088A|1991-04-18|1994-08-09|Matsushita Electric Industrial Co. Ltd.|Method of correcting an image signal decoded in block units|
US5430811A|1991-12-25|1995-07-04|Matsushita Electric Industrial Co., Ltd.|Method for interpolating missing pixels and an apparatus employing the method|
US5594813A|1992-02-19|1997-01-14|Integrated Information Technology, Inc.|Programmable architecture and methods for motion estimation|
JP2636622B2|1992-03-13|1997-07-30|松下電器産業株式会社|Video signal encoding method and decoding method, and video signal encoding apparatus and decoding apparatus|
US5461423A|1992-05-29|1995-10-24|Sony Corporation|Apparatus for generating a motion vector with half-pixel precision for use in compressing a digital motion picture signal|
JP2723199B2|1992-06-03|1998-03-09|シャープ株式会社|Tracking servo pull-in circuit device for optical disk player|
KR100283343B1|1992-06-25|2001-03-02|이데이 노부유끼|Image signal encoding method and decoding method, image signal encoding apparatus and decoding apparatus|
JPH06197334A|1992-07-03|1994-07-15|Sony Corp|Picture signal coding method, picture signal decoding method, picture signal coder, picture signal decoder and picture signal recording medium|
KR970000761B1|1992-10-07|1997-01-18|대우전자 주식회사|Mini high-definition television|
EP0892562A3|1993-04-09|1999-01-27|Sony Corporation|Picture encoding method, picture encoding apparatus and picture recording medium|
JP2967014B2|1993-05-24|1999-10-25|キヤノン株式会社|Image processing device|
KR100318786B1|1993-06-01|2002-04-22|똥송 멀티메디아 에스. 에이.|Motion compensated interpolation method and apparatus|
US5684538A|1994-08-18|1997-11-04|Hitachi, Ltd.|System and method for performing video coding/decoding using motion compensation|
JP3392564B2|1995-02-27|2003-03-31|三洋電機株式会社|Single-panel color video camera|
JPH09102954A|1995-10-04|1997-04-15|Matsushita Electric Ind Co Ltd|Method for calculating picture element value of block from one or two predictive blocks|
US5991463A|1995-11-08|1999-11-23|Genesis Microchip Inc.|Source data interpolation method and apparatus|
KR100192270B1|1996-02-03|1999-06-15|구자홍|The video decoding circuit in hdtv|
KR100226684B1|1996-03-22|1999-10-15|전주범|A half pel motion estimator|
JP3224514B2|1996-08-21|2001-10-29|シャープ株式会社|Video encoding device and video decoding device|
RU2131172C1|1996-12-10|1999-05-27|Полыковский Андрей Маркович|Interpolation method for compressing tv signal|
DE19730305A1|1997-07-15|1999-01-21|Bosch Gmbh Robert|Method for generating an improved image signal in the motion estimation of image sequences, in particular a prediction signal for moving images with motion-compensating prediction|
DE19746214A1|1997-10-21|1999-04-22|Bosch Gmbh Robert|Movement compensated prediction method for moving image sequences|
US6122017A|1998-01-22|2000-09-19|Hewlett-Packard Company|Method for providing motion-compensated multi-field enhancement of still images from video|
EP0993656A1|1998-04-29|2000-04-19|Koninklijke Philips Electronics N.V.|Image interpolation|
US6252576B1|1998-08-06|2001-06-26|In-System Design, Inc.|Hardware-efficient system for hybrid-bilinear image scaling|
EP1083752A1|1999-09-08|2001-03-14|STMicroelectronics S.r.l.|Video decoder with reduced memory|
JP4599672B2|1999-12-21|2010-12-15|株式会社ニコン|Interpolation processing apparatus and recording medium recording interpolation processing program|
US6950469B2|2001-09-17|2005-09-27|Nokia Corporation|Method for sub-pixel value interpolation|
KR100311482B1|1999-10-21|2001-10-18|구자홍|Method of filtering control of image bilinear interpolation|
US7266150B2|2001-07-11|2007-09-04|Dolby Laboratories, Inc.|Interpolation of video compression frames|
US7082450B2|2001-08-30|2006-07-25|Nokia Corporation|Implementation of a transform and of a subsequent quantization|
US6950469B2|2001-09-17|2005-09-27|Nokia Corporation|Method for sub-pixel value interpolation|
CN1298171C|2001-09-18|2007-01-31|松下电器产业株式会社|Image encoding method and image decoding method|
US20030059089A1|2001-09-25|2003-03-27|Quinlan James E.|Block matching at the fractional pixel level for motion estimation|
US7630566B2|2001-09-25|2009-12-08|Broadcom Corporation|Method and apparatus for improved estimation and compensation in digital video compression and decompression|
US7181070B2|2001-10-30|2007-02-20|Altera Corporation|Methods and apparatus for multiple stage video decoding|
EP1315124A3|2001-11-13|2004-08-18|Trusight Ltd.|Image compression with dynamic programming|
CN101448162B|2001-12-17|2013-01-02|微软公司|Method for processing video image|
EP1469682A4|2002-01-24|2010-01-27|Hitachi Ltd|Moving picture signal coding method, decoding method, coding apparatus, and decoding apparatus|
US8175159B2|2002-01-24|2012-05-08|Hitachi, Ltd.|Moving picture signal coding method, decoding method, coding apparatus, and decoding apparatus|
US7003035B2|2002-01-25|2006-02-21|Microsoft Corporation|Video coding methods and apparatuses|
US8284844B2|2002-04-01|2012-10-09|Broadcom Corporation|Video decoding system supporting multiple standards|
US7110459B2|2002-04-10|2006-09-19|Microsoft Corporation|Approximate bicubic filter|
US7305034B2|2002-04-10|2007-12-04|Microsoft Corporation|Rounding control for multi-stage interpolation|
US7116831B2|2002-04-10|2006-10-03|Microsoft Corporation|Chrominance motion vector rounding|
US7620109B2|2002-04-10|2009-11-17|Microsoft Corporation|Sub-pixel interpolation in motion estimation and compensation|
US20040001546A1|2002-06-03|2004-01-01|Alexandros Tourapis|Spatiotemporal prediction for bidirectionally predictivepictures and motion vector prediction for multi-picture reference motion compensation|
US7154952B2|2002-07-19|2006-12-26|Microsoft Corporation|Timestamp-independent motion vector prediction for predictiveand bidirectionally predictivepictures|
KR100472476B1|2002-08-31|2005-03-10|삼성전자주식회사|Interpolation apparatus and method for moving vector compensation|
US7400774B2|2002-09-06|2008-07-15|The Regents Of The University Of California|Encoding and decoding of digital data using cues derivable at a decoder|
US7231090B2|2002-10-29|2007-06-12|Winbond Electronics Corp.|Method for performing motion estimation with Walsh-Hadamard transform |
US7408988B2|2002-12-20|2008-08-05|Lsi Corporation|Motion estimation engine with parallel interpolation and search hardware|
US7212676B2|2002-12-30|2007-05-01|Intel Corporation|Match MSB digital image compression|
JP3997171B2|2003-03-27|2007-10-24|株式会社エヌ・ティ・ティ・ドコモ|Moving picture encoding apparatus, moving picture encoding method, moving picture encoding program, moving picture decoding apparatus, moving picture decoding method, and moving picture decoding program|
US9330060B1|2003-04-15|2016-05-03|Nvidia Corporation|Method and device for encoding and decoding video image data|
US8660182B2|2003-06-09|2014-02-25|Nvidia Corporation|MPEG motion estimation based on dual start points|
US7426308B2|2003-07-18|2008-09-16|Microsoft Corporation|Intraframe and interframe interlace coding and decoding|
US10554985B2|2003-07-18|2020-02-04|Microsoft Technology Licensing, Llc|DC coefficient signaling at small quantization step sizes|
US7738554B2|2003-07-18|2010-06-15|Microsoft Corporation|DC coefficient signaling at small quantization step sizes|
US20050013498A1|2003-07-18|2005-01-20|Microsoft Corporation|Coding of motion vector information|
US7567617B2|2003-09-07|2009-07-28|Microsoft Corporation|Predicting motion vectors for fields of forward-predicted interlaced video frames|
US8064520B2|2003-09-07|2011-11-22|Microsoft Corporation|Advanced bi-directional predictive coding of interlaced video|
US7724827B2|2003-09-07|2010-05-25|Microsoft Corporation|Multi-layer run level encoding and decoding|
US7317839B2|2003-09-07|2008-01-08|Microsoft Corporation|Chroma motion vector derivation for interlaced forward-predicted fields|
US7253374B2|2003-09-15|2007-08-07|General Motors Corporation|Sheet-to-tube welded structure and method|
NO319629B1|2003-11-28|2005-09-05|Tandberg Telecom As|Procedure for correcting interpolated pixel values|
NO320114B1|2003-12-05|2005-10-24|Tandberg Telecom As|Improved calculation of interpolated pixel values|
EP1578137A2|2004-03-17|2005-09-21|Matsushita Electric Industrial Co., Ltd.|Moving picture coding apparatus with multistep interpolation process|
JP4419062B2|2004-03-29|2010-02-24|ソニー株式会社|Image processing apparatus and method, recording medium, and program|
WO2005096632A1|2004-03-31|2005-10-13|Koninklijke Philips Electronics N.V.|Motion estimation and segmentation for video data|
CN1926882B|2004-04-21|2010-10-06|松下电器产业株式会社|Motion compensating apparatus|
KR100605105B1|2004-05-28|2006-07-26|삼성전자주식회사|Apparatus of Image Interpolation|
US7565020B2|2004-07-03|2009-07-21|Microsoft Corp.|System and method for image coding employing a hybrid directional prediction and wavelet lifting|
CN100377599C|2004-09-03|2008-03-26|北京航空航天大学|A fast sub-picture element movement estimating method|
US20060088104A1|2004-10-27|2006-04-27|Stephen Molloy|Non-integer pixel sharing for video encoding|
US7792192B2|2004-11-19|2010-09-07|Analog Devices, Inc.|System and method for sub-pixel interpolation in motion vector estimation|
JP4277793B2|2004-12-17|2009-06-10|ソニー株式会社|Image processing apparatus, encoding apparatus, and methods thereof|
US7668455B2|2004-12-20|2010-02-23|Fujifilm Corporation|Image capturing apparatus, image capturing method, reproducing apparatus, reproducing method and program|
US7653132B2|2004-12-21|2010-01-26|Stmicroelectronics, Inc.|Method and system for fast implementation of subpixel interpolation|
CN100411435C|2005-01-24|2008-08-13|威盛电子股份有限公司|System and method for decreasing possess memory band width in video coding|
JP4736456B2|2005-02-15|2011-07-27|株式会社日立製作所|Scanning line interpolation device, video display device, video signal processing device|
US8175168B2|2005-03-18|2012-05-08|Sharp Laboratories Of America, Inc.|Methods and systems for picture up-sampling|
WO2006106039A1|2005-04-06|2006-10-12|Thomson Licensing|Method and apparatus for encoding enhancement layer video data|
JP4081103B2|2005-05-11|2008-04-23|株式会社東芝|Video encoding device|
US7526419B2|2005-05-24|2009-04-28|International Business Machines Corporation|Methods for reconstructing data from simulation models|
KR101293078B1|2005-07-28|2013-08-16|톰슨 라이센싱|Motion estimation and compensation using a hierarchical cache|
US20070040837A1|2005-08-19|2007-02-22|Seok Jin W|Motion vector estimation method and continuous picture generation method based on convexity property of sub pixel|
KR100623036B1|2005-09-22|2006-09-13|삼익전자공업 주식회사|Electric signboard system improved resolution with dynamic interpolation scanning|
CN1859576A|2005-10-11|2006-11-08|华为技术有限公司|Top sampling method and its system for space layered coding video image|
US8265151B1|2005-12-14|2012-09-11|Ambarella Taiwan Ltd.|Mode decision using approximate 1/2 pel interpolation|
US8731071B1|2005-12-15|2014-05-20|Nvidia Corporation|System for performing finite input responsefiltering in motion estimation|
US20070146242A1|2005-12-22|2007-06-28|Eastman Kodak Company|High resolution display for monochrome images with color highlighting|
CN1794821A|2006-01-11|2006-06-28|浙江大学|Method and device of interpolation in grading video compression|
JP4677351B2|2006-02-17|2011-04-27|キヤノン株式会社|Motion compensator, motion compensation processing method, computer program, and storage medium|
US8724702B1|2006-03-29|2014-05-13|Nvidia Corporation|Methods and systems for motion estimation used in video coding|
WO2007116551A1|2006-03-30|2007-10-18|Kabushiki Kaisha Toshiba|Image coding apparatus and image coding method, and image decoding apparatus and image decoding method|
WO2007114368A1|2006-03-30|2007-10-11|Kabushiki Kaisha Toshiba|Image coding apparatus and method, and image decoding apparatus and method|
US8208553B2|2006-05-04|2012-06-26|Altera Corporation|Methods and apparatus for quarter-pel refinement in a SIMD array processor|
JP4682384B2|2006-07-11|2011-05-11|株式会社メガチップス|1/4 pixel luminance motion prediction mechanism, combined luminance motion prediction mechanism, and combined luminance / color difference motion prediction mechanism|
US8253752B2|2006-07-20|2012-08-28|Qualcomm Incorporated|Method and apparatus for encoder assisted pre-processing|
US8155454B2|2006-07-20|2012-04-10|Qualcomm Incorporated|Method and apparatus for encoder assisted post-processing|
US8660380B2|2006-08-25|2014-02-25|Nvidia Corporation|Method and system for performing two-dimensional transform on data value array with reduced power consumption|
KR100804451B1|2006-09-25|2008-02-20|광운대학교 산학협력단|1/4 quarter pixel interpolation method for imaging process and processor thereof|
US9307122B2|2006-09-27|2016-04-05|Core Wireless Licensing S.A.R.L.|Method, apparatus, and computer program product for providing motion estimation for video encoding|
KR100827093B1|2006-10-13|2008-05-02|삼성전자주식회사|Method for video encoding and apparatus for the same|
KR100800761B1|2006-10-19|2008-02-01|삼성전자주식회사|Apparatus and method of interpolationing chroma signal for minimization of calculation load|
KR101354659B1|2006-11-08|2014-01-28|삼성전자주식회사|Method and apparatus for motion compensation supporting multicodec|
KR100874949B1|2006-11-15|2008-12-19|삼성전자주식회사|Single instruction multiple data processor and memory array structure for it|
JP4753204B2|2006-11-17|2011-08-24|株式会社ソニー・コンピュータエンタテインメント|Encoding processing apparatus and encoding processing method|
US8411709B1|2006-11-27|2013-04-02|Marvell International Ltd.|Use of previously buffered state information to decode in an hybrid automatic repeat requesttransmission mode|
JP2008165381A|2006-12-27|2008-07-17|Ricoh Co Ltd|Image processing device and image processing method|
KR101411315B1|2007-01-22|2014-06-26|삼성전자주식회사|Method and apparatus for intra/inter prediction|
US8296662B2|2007-02-05|2012-10-23|Brother Kogyo Kabushiki Kaisha|Image display device|
CA2681210C|2007-04-09|2021-03-02|Nokia Corporation|High accuracy motion vectors for video coding with low encoder and decoder complexity|
US8756482B2|2007-05-25|2014-06-17|Nvidia Corporation|Efficient encoding/decoding of a sequence of data frames|
US9118927B2|2007-06-13|2015-08-25|Nvidia Corporation|Sub-pixel interpolation and its application in motion compensated encoding of a video signal|
KR101380615B1|2007-06-28|2014-04-14|삼성전자주식회사|Method and apparatus for improving dynamic range of images|
US8254455B2|2007-06-30|2012-08-28|Microsoft Corporation|Computing collocated macroblock information for direct mode macroblocks|
US8509567B2|2007-07-09|2013-08-13|Analog Devices, Inc.|Half pixel interpolator for video motion estimation accelerator|
US8873625B2|2007-07-18|2014-10-28|Nvidia Corporation|Enhanced compression in representing non-frame-edge blocks of image frames|
KR101396365B1|2007-08-28|2014-05-30|삼성전자주식회사|Method and apparatus for spatiotemporal motion estimation and motion compensation of video|
KR100909390B1|2007-09-18|2009-07-24|한국과학기술원|High speed motion compensation device and method|
JP4461165B2|2007-09-26|2010-05-12|株式会社東芝|Image processing apparatus, method, and program|
JP4900175B2|2007-10-04|2012-03-21|セイコーエプソン株式会社|Image processing apparatus and method, and program|
AU2008306503A1|2007-10-05|2009-04-09|Nokia Corporation|Video coding with pixel-aligned directional adaptive interpolation filters|
US8416861B2|2007-10-14|2013-04-09|Nokia Corporation|Fixed-point implementation of an adaptive image filter with high coding efficiency|
US8897393B1|2007-10-16|2014-11-25|Marvell International Ltd.|Protected codebook selection at receiver for transmit beamforming|
US8542725B1|2007-11-14|2013-09-24|Marvell International Ltd.|Decision feedback equalization for signals having unequally distributed patterns|
TWI389573B|2007-12-06|2013-03-11|Mstar Semiconductor Inc|Image processing method and related apparatus for performing image processing operation only according to image blocks in horizontal direction|
KR101456487B1|2008-03-04|2014-10-31|삼성전자주식회사|Method and apparatus for encoding and decoding using sub-pixel motion prediction|
US8565325B1|2008-03-18|2013-10-22|Marvell International Ltd.|Wireless device communication in the 60GHz band|
US8971412B2|2008-04-10|2015-03-03|Qualcomm Incorporated|Advanced interpolation techniques for motion compensation in video coding|
US20090257499A1|2008-04-10|2009-10-15|Qualcomm Incorporated|Advanced interpolation techniques for motion compensation in video coding|
US8831086B2|2008-04-10|2014-09-09|Qualcomm Incorporated|Prediction techniques for interpolation in video coding|
US9077971B2|2008-04-10|2015-07-07|Qualcomm Incorporated|Interpolation-like filtering of integer-pixel positions in video coding|
US9967590B2|2008-04-10|2018-05-08|Qualcomm Incorporated|Rate-distortion defined interpolation for video coding based on fixed filter or adaptive filter|
US8705622B2|2008-04-10|2014-04-22|Qualcomm Incorporated|Interpolation filter support for sub-pixel resolution in video coding|
US8462842B2|2008-04-10|2013-06-11|Qualcomm, Incorporated|Symmetry for interpolation filtering of sub-pixel positions in video coding|
RU2010145524A|2008-04-10|2012-05-20|Квэлкомм Инкорпорейтед |SYMMETRY FOR INTERPOLATION FILTRATION OF SUB-PIXEL POSITIONS IN VIDEO ENCODING|
US8804831B2|2008-04-10|2014-08-12|Qualcomm Incorporated|Offsets at sub-pixel resolution|
EP2304963B1|2008-07-01|2015-11-11|Orange|Method and device for encoding images using improved prediction, and corresponding decoding method and device, signal and computer software|
US8811484B2|2008-07-07|2014-08-19|Qualcomm Incorporated|Video encoding by filter selection|
AU2009269607B2|2008-07-08|2015-03-26|Nortech InternationalLimited|Apparatus and method of classifying movement of objects in a monitoring zone|
JP2010028220A|2008-07-15|2010-02-04|Sony Corp|Motion vector detecting device, motion vector detecting method, image encoding device, and program|
KR101638206B1|2008-07-29|2016-07-08|오렌지|Method for updating an encoder by filter interpolation|
US8498342B1|2008-07-29|2013-07-30|Marvell International Ltd.|Deblocking filtering|
US8761261B1|2008-07-29|2014-06-24|Marvell International Ltd.|Encoding using motion vectors|
CN102113326A|2008-08-04|2011-06-29|Dolby Laboratories Licensing Corporation|Overlapped block disparity estimation and compensation architecture|
US8345533B1|2008-08-18|2013-01-01|Marvell International Ltd.|Frame synchronization techniques|
US8750378B2|2008-09-23|2014-06-10|Qualcomm Incorporated|Offset calculation in switched interpolation filters|
US8131056B2|2008-09-30|2012-03-06|International Business Machines Corporation|Constructing variability maps by correlating off-state leakage emission images to layout information|
US8681893B1|2008-10-08|2014-03-25|Marvell International Ltd.|Generating pulses using a look-up table|
US8666181B2|2008-12-10|2014-03-04|Nvidia Corporation|Adaptive multiple engine image motion detection system and method|
US20100165078A1|2008-12-30|2010-07-01|Sensio Technologies Inc.|Image compression using checkerboard mosaic for luminance and chrominance color space images|
US20100166076A1|2008-12-30|2010-07-01|Tandberg Telecom As|Method, apparatus, and computer readable medium for calculating run and level representations of quantized transform coefficients representing pixel values included in a block of a video picture|
JP2010161747A|2009-01-09|2010-07-22|Toshiba Corp|Apparatus and method for generating sub-pixel, and motion compensating apparatus|
US8189666B2|2009-02-02|2012-05-29|Microsoft Corporation|Local picture identifier and computation of co-located information|
JP5580541B2|2009-03-06|2014-08-27|Panasonic Corporation|Image decoding apparatus and image decoding method|
US8520771B1|2009-04-29|2013-08-27|Marvell International Ltd.|WCDMA modulation|
US7991245B2|2009-05-29|2011-08-02|Putman Matthew C|Increasing image resolution method employing known background and specimen|
JP2011030184A|2009-07-01|2011-02-10|Sony Corp|Image processing apparatus, and image processing method|
JP5325745B2|2009-11-02|2013-10-23|Sony Computer Entertainment Inc.|Moving image processing program, apparatus and method, and imaging apparatus equipped with moving image processing apparatus|
KR101601848B1|2009-12-01|2016-03-10|SK Telecom Co., Ltd.|Apparatus and method for generating inter prediction frame and reference frame interpolation apparatus and method therefor|
US8406537B2|2009-12-17|2013-03-26|General Electric Company|Computed tomography system with data compression and transfer|
EP2514210A4|2009-12-17|2014-03-19|Ericsson Telefon Ab L M|Method and arrangement for video coding|
US20110200108A1|2010-02-18|2011-08-18|Qualcomm Incorporated|Chrominance high precision motion filtering for motion interpolation|
JP2011199396A|2010-03-17|2011-10-06|Ntt Docomo Inc|Moving image prediction encoding device, moving image prediction encoding method, moving image prediction encoding program, moving image prediction decoding device, moving image prediction decoding method, and moving image prediction decoding program|
KR101682147B1|2010-04-05|2016-12-05|Samsung Electronics Co., Ltd.|Method and apparatus for interpolation based on transform and inverse transform|
KR101847072B1|2010-04-05|2018-04-09|Samsung Electronics Co., Ltd.|Method and apparatus for video encoding, and method and apparatus for video decoding|
US9219921B2|2010-04-12|2015-12-22|Qualcomm Incorporated|Mixed tap filters|
TWI678916B|2010-04-13|2019-12-01|GE Video Compression, LLC|Sample region merging|
DK2559245T3|2010-04-13|2015-08-24|GE Video Compression, LLC|Video coding using multi-tree subdivision of images|
CN106060558B|2010-04-13|2019-08-13|GE Video Compression, LLC|Decoder, method for reconstructing an array, encoder, and coding method|
BR122020007923B1|2010-04-13|2021-08-03|GE Video Compression, LLC|INTERPLANE PREDICTION|
US8963996B2|2010-05-05|2015-02-24|Samsung Electronics Co., Ltd.|Communication of stereoscopic three-dimensional video information including uncompressed eye view video frames|
TWI423164B|2010-05-07|2014-01-11|Silicon Motion Inc|Method for generating a high quality up-scaled image, and associated device|
ES2703005T3|2010-05-07|2019-03-06|Nippon Telegraph & Telephone|Moving picture coding control method, moving picture coding apparatus and moving picture coding program|
CN102402781B|2010-09-13|2014-05-14|慧荣科技股份有限公司|Method and relevant device for generating high-quality enlarged image|
JP5286581B2|2010-05-12|2013-09-11|Nippon Telegraph and Telephone Corporation|Moving picture coding control method, moving picture coding apparatus, and moving picture coding program|
US8447105B2|2010-06-07|2013-05-21|Microsoft Corporation|Data driven interpolation using geodesic affinity|
WO2012005558A2|2010-07-09|2012-01-12|Samsung Electronics Co., Ltd.|Image interpolation method and apparatus|
US8817771B1|2010-07-16|2014-08-26|Marvell International Ltd.|Method and apparatus for detecting a boundary of a data frame in a communication network|
US20120027081A1|2010-07-30|2012-02-02|Cisco Technology Inc.|Method, system, and computer readable medium for implementing run-level coding|
US20120063515A1|2010-09-09|2012-03-15|Qualcomm Incorporated|Efficient Coding of Video Parameters for Weighted Motion Compensated Prediction in Video Coding|
US10045046B2|2010-12-10|2018-08-07|Qualcomm Incorporated|Adaptive support for interpolating values of sub-pixels for video coding|
PT2996334T|2010-12-21|2018-09-28|Ntt Docomo Inc|Intra-prediction coding and decoding under a planar mode|
US9445126B2|2011-01-05|2016-09-13|Qualcomm Incorporated|Video filtering using a combination of one-dimensional switched filter and one-dimensional adaptive filter|
CN102595118B|2011-01-14|2015-04-08|Huawei Technologies Co., Ltd.|Prediction method and predictor in encoding and decoding|
US8797391B2|2011-01-14|2014-08-05|Himax Media Solutions, Inc.|Stereo image displaying method|
US9049454B2|2011-01-19|2015-06-02|Google Technology Holdings LLC|High efficiency low complexity interpolation filters|
US20120224639A1|2011-03-03|2012-09-06|General Instrument Corporation|Method for interpolating half pixels and quarter pixels|
US20120230407A1|2011-03-11|2012-09-13|General Instrument Corporation|Interpolation Filter Selection Using Prediction Index|
JP5768491B2|2011-05-17|2015-08-26|Sony Corporation|Image processing apparatus and method, program, and recording medium|
MX337794B|2011-06-24|2016-03-18|Ntt Docomo Inc|Method and apparatus for motion compensation prediction.|
EP2724534A2|2011-06-24|2014-04-30|Motorola Mobility LLC|Selection of phase offsets for interpolation filters for motion compensation|
PL3448025T3|2011-06-28|2020-04-30|Samsung Electronics Co., Ltd.|Image interpolation using asymmetric interpolation filter|
CN102857752B|2011-07-01|2016-03-30|Huawei Technologies Co., Ltd.|Pixel prediction method and apparatus|
BR112013033743A2|2011-07-01|2019-09-24|Motorola Mobility Inc|subpixel interpolation filter set for temporal prediction|
US9129411B2|2011-07-21|2015-09-08|Luca Rossato|Upsampling in a tiered signal quality hierarchy|
PL2744204T3|2011-09-14|2019-03-29|Samsung Electronics Co., Ltd.|Method for decoding a prediction unit based on its size|
US20130070091A1|2011-09-19|2013-03-21|Michael Mojaver|Super resolution imaging and tracking system|
US10924668B2|2011-09-19|2021-02-16|Epilog Imaging Systems|Method and apparatus for obtaining enhanced resolution images|
US9137433B2|2011-09-19|2015-09-15|Michael Mojaver|Super resolution binary imaging and tracking system|
RU2473124C1|2011-09-23|2013-01-20|Visi Rus LLC|Method of detecting pornography on digital images|
US20130083845A1|2011-09-30|2013-04-04|Research In Motion Limited|Methods and devices for data compression using a non-uniform reconstruction space|
ES2816567T3|2011-10-24|2021-04-05|Innotive Ltd|Method and apparatus for decoding intra-prediction mode|
EP2595382B1|2011-11-21|2019-01-09|BlackBerry Limited|Methods and devices for encoding and decoding transform domain filters|
JP5911166B2|2012-01-10|2016-04-27|Sharp Corporation|Image processing apparatus, image processing method, image processing program, imaging apparatus, and image display apparatus|
US9325991B2|2012-04-11|2016-04-26|Qualcomm Incorporated|Motion vector rounding|
US8819525B1|2012-06-14|2014-08-26|Google Inc.|Error concealment guided robustness|
US9041834B2|2012-09-19|2015-05-26|Ziilabs Inc., Ltd.|Systems and methods for reducing noise in video streams|
JP5730274B2|2012-11-27|2015-06-03|Kyocera Document Solutions Inc.|Image processing device|
JP5697649B2|2012-11-27|2015-04-08|Kyocera Document Solutions Inc.|Image processing device|
US9432690B2|2013-01-30|2016-08-30|Ati Technologies Ulc|Apparatus and method for video processing|
US9225979B1|2013-01-30|2015-12-29|Google Inc.|Remote access encoding|
US20140267916A1|2013-03-12|2014-09-18|Tandent Vision Science, Inc.|Selective perceptual masking via scale separation in the spatial and temporal domains using intrinsic images for use in data compression|
US20140269943A1|2013-03-12|2014-09-18|Tandent Vision Science, Inc.|Selective perceptual masking via downsampling in the spatial and temporal domains using intrinsic images for use in data compression|
CN105144701A|2013-05-01|2015-12-09|LG Electronics Inc.|Apparatus and method of transmitting and receiving signal|
AU2013213660A1|2013-08-06|2015-02-26|Canon Kabushiki Kaisha|Method for printing an upscaled image|
US9774881B2|2014-01-08|2017-09-26|Microsoft Technology Licensing, Llc|Representing motion vectors in an encoded bitstream|
US9942560B2|2014-01-08|2018-04-10|Microsoft Technology Licensing, Llc|Encoding screen capture data|
US9749642B2|2014-01-08|2017-08-29|Microsoft Technology Licensing, Llc|Selection of motion vector precision|
CN103793917B|2014-02-24|2017-02-01|哈尔滨工程大学|Remote sensing image sub-pixel positioning method combining two interpolation algorithms|
US10462480B2|2014-12-31|2019-10-29|Microsoft Technology Licensing, Llc|Computationally efficient motion estimation|
US10291932B2|2015-03-06|2019-05-14|Qualcomm Incorporated|Method and apparatus for low complexity quarter pel generation in motion search|
US10283031B2|2015-04-02|2019-05-07|Apple Inc.|Electronic device with image processor to reduce color motion blur|
US10275863B2|2015-04-03|2019-04-30|Cognex Corporation|Homography rectification|
US9542732B2|2015-04-03|2017-01-10|Cognex Corporation|Efficient image transformation|
US10009622B1|2015-12-15|2018-06-26|Google Llc|Video coding with degradation of residuals|
US10116957B2|2016-09-15|2018-10-30|Google Inc.|Dual filter type for motion compensated prediction in video coding|
CN106658024B|2016-10-20|2019-07-16|Hangzhou Danghong Technology Co., Ltd.|Fast video coding method|
US10499078B1|2017-02-07|2019-12-03|Google Llc|Implicit motion compensation filter selection|
CN106998437B|2017-03-31|2020-07-31|Wuhan Douyu Network Technology Co., Ltd.|Method and device for reconstructing video image|
CN109922329B|2017-12-13|2021-02-26|Beijing Chuansong Technology Co., Ltd.|Compression method, decompression method and device for virtual reality image data|
US11044518B2|2018-03-20|2021-06-22|At&T Mobility Ii Llc|Video access master platform|
US11051058B2|2018-09-24|2021-06-29|Hewlett Packard Enterprise Development Lp|Real-time wireless video delivery system using a multi-channel communications link|
CN109348234B|2018-11-12|2021-11-19|Beijing Jiaxun Feihong Electrical Co., Ltd.|Efficient sub-pixel motion estimation method and system|
US11006132B1|2019-11-08|2021-05-11|Op Solutions, Llc|Methods and systems for adaptive cropping|
Legal status:
Priority:
Application number | Filing date | Patent title
US09/954,608|US6950469B2|2001-09-17|2001-09-17|Method for sub-pixel value interpolation|
US954608|2001-09-17|
PCT/FI2002/000729|WO2003026296A1|2001-09-17|2002-09-11|Method for sub-pixel value interpolation|